OpenXLSX-NX (Next Generation) is an advanced, high-performance C++ library for reading, writing, creating, and modifying Microsoft Excel® files in the .xlsx format. It is designed to be fast, cross-platform, and has minimal external dependencies.
- Modern C++: Built with C++17, ensuring type safety and modern abstractions.
- Enterprise Security: Hardened against malicious inputs with integrated LLVM libFuzzer for AST parsing, and zero-leak memory guarantees verified by ASan/UBSan.
- Cross-Engine Compatibility: Rigorously tested against real-world
.xlsxfiles generated by openpyxl (Python), excelize (Go), Google Sheets, and Tencent Docs to ensure lossless reading, modification, and re-saving of proprietary XML extensions. - Streaming with Styles: Write millions of rows with minimal memory overhead (O(1) memory complexity) while applying rich cell styles on the fly via
XLStreamWriter::appendRow. - Formula Engine: Built-in lightweight AST parser and evaluation engine (
XLFormulaEngine) supporting 120+ standard Excel functions (Math, Logic, Text, Statistical, Financial, Date/Time, Lookup, etc.) without requiring external Excel calculation. Fully cross-validated against Microsoft Excel and Go's Excelize library for mathematically identical deterministic output. - Advanced Charts: Fluent Builder API for generating complex 2D/3D charts (
XLChart), supporting series styling, legends, and strict OOXML schema compliance. - Dynamic Structural Mutation: Insert or delete rows and columns programmatically (
insertRow,deleteColumn) with automatic coordinate shifting. - Smart Style Deduplication: O(1) hash-based global style pool prevents
styles.xmlbloat and file corruption when applying bulk formatting to massive datasets. - High Performance: Optimized for speed, capable of processing millions of cells per second.
- Zero-Dependency Core: All dependencies (
libzip,pugixml,fmt,fast_float) are integrated via CMake'sFetchContentor standard library. - Unicode & Historical Bugs Support: Consistent UTF-8 handling and precise emulation of Excel's historical quirks (e.g., the 1900 Leap Year bug) to guarantee data fidelity.
- Ergonomic Facade APIs: "Python/Go-like" C++ developer experience. Use declarative
XLStylebuilder to style cells instantly, auto-fit column widths, or freeze panes intuitively (freezePanes("B2")). - Comprehensive Image Support: Insert images via simple APIs (
wks.insertImage("A1", "logo.png")) with zero-dependency automatic resolution detection and dynamic aspect-ratio scaling (XLImageOptions). - Batch & Matrix Operations: Bulk-fill 2D matrices (
std::vector<std::vector<T>>) into regions usingXLCellRange::setValue()withoutforloops, and apply bulk styling/border-outlining instantly. - Advanced Data Validation: Fluent-API builder for generating dropdown lists, region subtraction, cross-sheet references, and error alerts (
requireList({"A", "B"})). - Data Tables & AutoFilters: High-level API for creating and managing Excel Tables (
XLTables), including totals rows, column-level aggregate functions (Sum,Average, etc.), and custom logical AutoFilters. - Modern Testing: Integrated with Catch2 v3.14.0 utilizing Data-Driven Generators (
GENERATE) for robust verification.
#include <OpenXLSX.hpp>
#include <iostream>
using namespace OpenXLSX;
int main() {
XLDocument doc;
doc.create("Example.xlsx", XLForceOverwrite);
auto wks = doc.workbook().worksheet("Sheet1");
// Easy styling and data writing
wks.cell("A1").value() = "Hello, OpenXLSX!";
XLStyle style;
style.font.bold = true;
style.font.color = XLColor("FF0000"); // Red text
wks.cell("A1").setStyle(style);
// AutoFit Column
wks.autoFitColumn(1);
doc.save();
return 0;
}Results based on the OpenXLSXBenchmark suite running against an 800,000-cell dataset (100,000 rows × 8 columns) on a Release build.
| Operation | Time / Execution | Throughput | Description |
|---|---|---|---|
| Write Integers | 202 ms | ~3.9M cells/sec | Direct integer-to-XML serialization without string intermediates. |
| Write Booleans | 229 ms | ~3.5M cells/sec | High-speed boolean state serialization. |
| Write Strings | 276 ms | ~2.9M cells/sec | Zero-copy SharedStrings aggregation using O(1) hash maps. |
| Write Floats | 282 ms | ~2.8M cells/sec | Fast double formatting via std::to_chars and {fmt}. |
| Read Strings | 177 ms | ~4.5M cells/sec | Parsing and lookup via ankerl::unordered_dense. |
| Read Integers | 144 ms | ~5.5M cells/sec | Rapid DOM extraction and value parsing. |
| Random DOM Access | 3.4 ms | N/A | Backward column/row traversal via O(1) XML Hint Cache. |
| Style Deduplication | 136 ms | N/A | Generating, hashing, and deduping thousands of complex nested styles. |
| Formula Engine | 12.9 ms | N/A | XLFormulaEngine AST parsing and execution. |
The unmatched throughput of OpenXLSX is achieved through continuous low-level C++17 optimizations:
- Zero-Allocation Stream Writer: The
XLStreamWritermodule relies entirely on stack buffers and<charconv>(std::to_chars), cutting millions of redundantstd::stringallocations when appending rows. - Zero-Copy XML Serialization: Massive XML files (e.g., 80MB worksheets) bypass intermediate
std::ostringstreambuffers. They are streamed viaMallocXmlWriterdirectly into a resizing heap block, which is then handed off (viazip_source_buffer_create) tolibzipwithout any memory duplication (memcpy). - O(1) Style Hash Cache: Styling 1,000,000 cells identically results in exactly 1 XML node inside
styles.xml. The library intercepts all.setStyle()calls, serializes them in an isolated temporary DOM, generates a fingerprint, and performs an O(1) cache lookup to prevent XML bloat and file corruption. - Lazy DOM Updates: Eliminates O(N) operations during file saving (like
XLWorksheet::columnCount()) by maintaining boundary limits via an O(1) dirty-flag state machine. - Fast Startup: Opening a
.xlsxwith thousands of embedded resources uses a pre-allocated hash set to cross-reference unhandled components, dropping load times from O(N²) down to O(1).
#include <OpenXLSX.hpp>
#include <iostream>
using namespace OpenXLSX;
int main() {
XLDocument doc;
doc.create("SalesTable.xlsx", XLForceOverwrite);
auto wks = doc.workbook().worksheet("Sheet1");
// 1. Prepare data (Headers + 3 rows)
wks.cell("B2").value() = "Product";
wks.cell("C2").value() = "Sales";
wks.cell("B3").value() = "Apples"; wks.cell("C3").value() = 5000;
wks.cell("B4").value() = "Oranges"; wks.cell("C4").value() = 4200;
wks.cell("B5").value() = "Bananas"; wks.cell("C5").value() = 7500;
// 2. Create a Table and snap it to the data
auto& table = wks.tables();
table.setName("SalesTable");
table.resizeToFitData(wks); // Automatically detects B2:C5
// 3. Enable Totals Row and expand range (B2:C6)
table.setRangeReference("B2:C6");
table.setShowTotalsRow(true);
table.setStyleName("TableStyleMedium9");
// 4. Configure Column-level Aggregate Functions
auto col1 = table.appendColumn("Product");
col1.setTotalsRowLabel("Total:");
wks.cell("B6").value() = "Total:";
auto col2 = table.appendColumn("Sales");
col2.setTotalsRowFunction(XLTotalsRowFunction::Sum);
// Use SUBTOTAL formula for dynamic updates when filtering
wks.cell("C6").formula() = "SUBTOTAL(109,SalesTable[Sales])";
// 5. Apply a Custom AutoFilter (Sales > 4500)
auto filter = table.autoFilter();
filter.filterColumn(1).setCustomFilter("greaterThan", "4500");
doc.save();
return 0;
}OpenXLSX uses a simplified CMake (3.15+) build system.
The recommended way to use OpenXLSX is via CMake's FetchContent:
include(FetchContent)
FetchContent_Declare(
OpenXLSX_NX
GIT_REPOSITORY https://github.com/twn39/OpenXLSX-NX.git
GIT_TAG master # Or a specific tag
)
FetchContent_MakeAvailable(OpenXLSX_NX)
target_link_libraries(my_project PRIVATE OpenXLSX::OpenXLSX) # 100% backward compatible targetTo build the library and tests locally:
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build . -j8The library features a comprehensive test suite. To run the tests after building:
# From the build directory
./OpenXLSXTestsAll string input and output must be in UTF-8 encoding. OpenXLSX uses std::filesystem to handle cross-platform path conversion (including UTF-8 on Windows). Ensure your source files are saved in UTF-8.
- Sheet Indexing: 1-based (consistent with Excel).
- Row/Column Indexing: Generally 1-based where it follows Excel conventions.
The build system includes platform-specific optimizations for Release builds (e.g., /O2 on MSVC, -O3 on GCC/Clang) and supports LTO (Link-Time Optimization) which can be toggled via OPENXLSX_ENABLE_LTO.
- PugiXML - Fast XML parsing.
- libzip & zlib-ng - Fast and compatible ZIP archive handling.
- fmt - Modern formatting library.
- fast_float - Fast floating-point parsing.
OpenXLSX-NX is engineered for extreme performance. Below are the benchmark results measured on an Apple Silicon (ARM64) processor using Catch2.
| Operation | Data Type | Average Time | Throughput |
|---|---|---|---|
| Write | Strings (std::string) |
~685 ms | 1,167,000 cells/sec |
| Write | Integers (int64_t) |
~546 ms | 1,465,000 cells/sec |
| Write | Floats (double) |
~814 ms | 982,000 cells/sec |
| Write | Booleans (bool) |
~590 ms | 1,355,000 cells/sec |
| Read | Strings | ~486 ms | 1,646,000 cells/sec |
| Read | Integers | ~345 ms | 2,318,000 cells/sec |
| Component | Test Description | Average Time | Operations/sec |
|---|---|---|---|
| Formula Engine | Lexing, parsing, and evaluating SUM(A1:A3) 10,000 times. |
~39 ms | 256,000 evals/sec |
| Style Pool | Invoking findOrCreateStyle(s) 50,000 times to deduplicate identical complex styles. |
~584 ms | 85,000 lookups/sec |
Note: Benchmarks can be compiled by setting -DOPENXLSX_BUILD_BENCHMARKS=ON in CMake.
auto range = wks.range("A1:C3");
// One-liner to dump a 2D matrix into the spreadsheet
std::vector<std::vector<int>> matrix = { {1,2,3}, {4,5,6}, {7,8,9} };
range.setValue(matrix);
// One-liner to extract a 2D matrix back to C++
auto extracted = range.getValue<int>();
// Draw a thick blue border ONLY around the outer perimeter of the 3x3 matrix
range.setBorderOutline(XLLineStyleThick, XLColor("0000FF"));Stop jumping between Worksheet and Cell scopes. Manage values, formats, and Excel 365 modern threaded discussions in one breath.
doc.setDefaultAuthor("System Manager");
wks.cell("A1").setValue("Audit Required")
.addNote("Please check this ASAP.") // Legacy yellow note
.getStyle().setFontColor(XLColor::Red);
// Object-Oriented Reply Cascades for modern Excel 365 threaded comments
wks.cell("C3").addComment("Is the Q3 data finalized?", "Alice")
.addReply("Uploading now.", "Bob")
.addReply("Got it.", "Alice").setResolved(true);Never worry about std::bad_alloc again. XLStreamReader uses a micro-DOM sliding window to read massive files with virtually zero memory overhead.
auto reader = doc.workbook().worksheet("MassiveData").streamReader();
while (reader.hasNext()) {
auto row = reader.nextRow();
std::cout << "Read row " << reader.currentRow() << " with " << row.size() << " columns." << std::endl;
}Insert any png, jpg, or gif using natural cell coordinates. The engine automatically parses the binary header to detect the image dimensions without depending on OpenCV or libpng.
XLImageOptions opts;
opts.scaleX = 0.5; // Scale to 50%
opts.positioning = XLImagePositioning::TwoCell; // Stretch with cell bounds
wks.insertImage("B2", "company_logo.png", opts);Build complex dropdown lists and warnings with method chaining.
wks.dataValidations().add("C2:C100")
.requireList({"Pending", "Approved", "Rejected"})
.setErrorAlert("Invalid", "Please select a valid state from the list.");// Instantly freeze Row 1 and Column A
wks.freezePanes("B2");
// Make Column B width auto-adapt to fit the longest text inside it
wks.autoFitColumn(2);
// Hide Column C
wks.column("C").setHidden(true);// Write a formula
wks.cell("E1").formula() = "SUM(A1:D1)";
// Merge multiple cells
wks.mergeCells("A1:D1");OpenXLSX features a thread-safe two-tier lock architecture that enables high-performance parallel processing, allowing multiple threads to write to different worksheets simultaneously without data races.
| Operation | Safety | Note |
|---|---|---|
| Multiple threads writing to different worksheets | ✅ Safe | Each worksheet has its own DOM |
save() during concurrent writes |
✅ Safe | Automatically blocks until writes finish |
Structural changes (addWorksheet, deleteSheet) |
✅ Safe | Protected by document-level exclusive lock |
| Shared string table contention (text insertion) | ✅ Safe | Protected by dedicated SST mutex |
| Multiple threads writing to the same worksheet | ❌ Unsafe | Pugixml DOM limitation |
// 1. Thread 1 writing to Sheet 1
std::thread t1([&]() {
auto wks1 = doc.workbook().worksheet("Sheet1");
wks1.cell("A1").value() = "Written from Thread 1";
});
// 2. Thread 2 writing to Sheet 2 concurrently!
std::thread t2([&]() {
auto wks2 = doc.workbook().worksheet("Sheet2");
wks2.cell("A1").value() = "Written from Thread 2";
});
t1.join();
t2.join();
// 3. Save operation safely blocks and waits for all writes to complete
doc.save();Note: The underlying Shared String Table is automatically protected by a dedicated mutex, making concurrent text insertions entirely thread-safe. Writing to the same worksheet from multiple threads is not supported.
Evaluate formulas directly in C++ without needing MS Excel to recalculate the file.
wks.cell("A1").value() = 10.5;
wks.cell("A2").value() = 20.2;
wks.cell("B1").formula() = "SUM(A1:A2)";
wks.cell("B2").formula() = "IF(A1>10, "High", "Low")";
XLFormulaEngine engine;
auto resolver = XLFormulaEngine::makeResolver(wks);
// Evaluates to 30.7
double sumResult = engine.evaluate(wks.cell("B1").formula().get(), resolver).get<double>();
// Evaluates to "High"
std::string logicResult = engine.evaluate(wks.cell("B2").formula().get(), resolver).getString();Insert or delete rows and columns on the fly. Existing data and coordinates shift automatically.
// Insert 2 blank rows starting at row 5 (existing row 5 becomes row 7)
wks.insertRow(5, 2);
// Delete column C (existing column D shifts left to become C)
wks.deleteColumn(3, 1); Optimize save performance or file size by setting the internal ZIP compression level.
// 0: Store (Fastest, Largest file)
// 1: Fast (Default, Best balance)
// 9: Best (Slowest, Smallest file)
doc.setCompressionLevel(1);
doc.save();Detailed Change Log
- Save Optimization: Added
setCompressionLevel()to allow fine-tuning the ZIP deflation ratio. The default has been set to 1 (Fastest) to prioritize maximum save performance without unnecessarily re-compressing structural XML files.
- 100% Deterministic Parity: Executed full data-driven validation against the Go Excelize math engine. Achieved 100% equivalence on over 120 supported Math, Logic, Date/Time, and Financial functions.
- Dimensional Lookup Resolution: Completely refactored
VLOOKUPandHLOOKUPto natively support 2D matrix bounding (rows()andcols()), fixing heuristic stride parsing issues found in standard range extraction. - Date/Time Edge Cases: Removed the strict
>= 1.0serial check to properly support Excel's time-only (< 1.0) serial calculation, ensuringHOUR,MINUTE, andSECONDresolve time intervals accurately. - Advanced Parameter Parsing: Fixed deep AST lexical bugs related to boolean function parameters (
AND,OR), empty string array inclusions (COUNTA), string-to-number interpolations (AVERAGEA), and sign polarity rules in financial compounding (NPER).
- Extended Function Support (
XLFormulaEngine): Added robust implementations for nearly 30 new, highly-requested standard Excel functions covering advanced mathematics, statistical analysis, date logic, and corporate finance. - Math & Logic: Added support for
MROUND,CEILING.MATH,FLOOR.MATH,TRUE, andFALSE. - Advanced Statistics: Implemented array-capable variance and standard deviation calculations (
VAR.P,VARP,STDEV.P,STDEVP,VARA,VARPA,STDEVA,STDEVPA), alongsidePERMUT,PERMUTATIONA,FISHER,FISHERINV,STANDARDIZE,PEARSON,RSQ, andAVERAGEIFS. - Business Date & Time: Added calculation logic for ISO standard and US/European trailing weeks via
ISOWEEKNUM,WEEKNUM, and 360-day year banking periods viaDAYS360. - Corporate Finance: Built out comprehensive financial modeling functions including
NPER(number of periods),DB(fixed-declining balance), andDDB(double-declining balance). - OOXML Extension (
_xlfn.) Compatibility: Modern functions introduced in Excel 2010+ automatically receive the_xlfn.namespace prefix (e.g.,_xlfn.CEILING.MATH,_xlfn.VAR.P) seamlessly, ensuring files modified or generated by OpenXLSX display properly natively in Microsoft Excel without #NAME? errors.
- Streaming with Styles (
XLStreamWriter): Solved the massive memory overhead bottleneck in enterprise exports. IntroducedXLStreamCelland heavily refactoredappendRowto support applying complexXLStyleIndexformats while streaming data at gigabytes per second with true O(1) memory complexity. - Fluent Rich Text API (
XLRichText): Overhauled the API to support method chaining (rt.addRun("Bold").setBold().setFontColor(...)) for effortless, multicolored cell generation. - Cross-Engine Compatibility (
testXLCompatibility): Established a rigorous testing matrix validating OpenXLSX's ability to losslessly parse and rewrite.xlsxfiles generated by openpyxl, excelize, Google Sheets, and Tencent Docs without corrupting their proprietary XML extensions. - LLVM libFuzzer Integration: Hooked the
XLFormulaEngineparser into Clang's libFuzzer to provide military-grade protection against infinite recursion, buffer overflows, and stack exhaustion caused by maliciously crafted formulas. - ASan/UBSan Memory Shield: Integrated AddressSanitizer and UndefinedBehaviorSanitizer directly into the CMake build lifecycle. Eliminated a lurking C++ move-semantics Undefined Behavior in
XLStylesand validated zero memory leaks across 220,000+ test assertions. - Edge Case & Exploit Hardening: Implemented targeted defense tests against Zip Slip/XXE (Billion Laughs) attacks, fixed strict adherence to Excel's 1900 Leap Year bug (serial
60.0), enforcedxml:space="preserve"bounds, and resolved string-to-number coercion edge cases.
- Formula Engine (
XLFormulaEngine): Introduced a standalone AST parser and evaluation engine supporting 45+ built-in functions (Math, Logic, Text). Calculates results in C++ without Excel. - Chart Builder API: Added robust support for creating complex charts with custom series styling and strictly compliant OOXML rendering.
- Dynamic Structural Mutation: Implemented
insertRow,deleteRow,insertColumn, anddeleteColumnwith full coordinate shifting logic. - O(1) Style Deduplication: Implemented a canonical XML fingerprinting and hash-based caching mechanism (
findOrCreateStyle) across all format sub-pools (Fonts, Fills, Borders) to eliminatestyles.xmlbloat during bulk operations. - Concurrency: Fixed race conditions in multi-threaded read environments.
- Table High-level API: Implemented a comprehensive
XLTablesinterface for creating and managing Excel Tables. - Totals Row Support: Added support for
showTotalsRow,totalsRowCount, and column-level aggregate functions (Sum,Average,Count, etc.) viaXLTableColumn. - Advanced AutoFilters: Enhanced
XLAutoFilterto support custom logical rules (e.g.,greaterThan) and value-based filtering. - Ergonomic Resizing: Added
resizeToFitData()to automatically snap table boundaries to contiguous worksheet data. - OOXML Compliance Fixes: Standardized boolean XML attributes to
1/0and optimized node ordering to ensure 100% compatibility with MS Excel's strict validation. - Performance & Testing: Added comprehensive Catch2 unit tests and OOXML structure verification tests.
- Modernized File Creation: Completely abandoned the hardcoded, 7.7KB hex-encoded binary
.xlsxtemplate. Migrated to dynamic,constexpr std::string_viewXML string templates (inspired by Excelize), significantly improving code readability and maintainability without sacrificing C++ performance. - Privacy & Metadata Scrubbing: Removed legacy workaround code that was previously necessary to scrub the original author's local file paths and revision histories embedded within the old binary payload. The new XML templates guarantee a 100% clean and pristine initial state for all generated documents.
- Zip Archive Reliability: Fixed a bug where files dynamically injected into memory during creation were occasionally discarded upon
save()due to missing internal modification flags inXLZipArchive. - Enhanced Validation Compliance: Embedded strict MS Excel required namespaces and
xl/theme/theme1.xmlto prevent any "file corruption / recovery" warnings when opening programmatically generated files.
- Implemented Rich Text: Added
XLRichTextandXLRichTextRunfor multi-format text segments within cells, including support for font colors and styles. - AutoFilter Support: New API to set and manage worksheet filters.
- Workbook Defined Names: Implemented
XLDefinedNamesfor managing global and local named ranges. - Page Setup & Print Options: Added comprehensive control over margins, orientation, paper size, and print-specific settings.
- Granular Sheet Protection: Enhanced protection with granular control over user permissions (sorting, formatting, filtering).
- DateTime Overhaul: Enhanced
XLDateTimewithstd::chronosupport,fromString/toStringmethods, and a convenientnow()function. - Internal OOXML Verification: Migrated structure verification tests to use internal APIs, removing external dependencies like
unzipfor testing. - Stability Fixes: Resolved potential segmentation faults in protection property access and fixed symbol conflicts in test builds.
- Removed
nowideDependency: Migrated to C++17std::filesystem::u8pathfor cross-platform UTF-8 path support, reducing the library footprint and simplifying build logic. - Fixed Document Properties: Resolved issues where Title, Subject, and Creator were not correctly updated in
core.xmldue to OOXML namespace handling. - XML Format Optimization: Switched to standard XML formatting to ensure XML declarations are preserved, improving compatibility with Excel and other spreadsheet viewers.
- CI Enhancement: Added caching for dependencies in GitHub Actions to speed up build times.
- Unified Build System: Consolidated all sub-module CMake configurations into a single root
CMakeLists.txt. - Optimization Suite: Added LTO support and platform-specific Release optimizations (
/O2,-O3, dead-code stripping). - Image Support: Implemented
XLDrawingand aspect-ratio aware image insertion. - Data Validation: Enhanced data validation support with full CRUD operations and Excel-compliant XML serialization.
- Enhanced Testing: Merged test suite into main build flow with automatic test data handling.