HowTo

How to measure performance using efxperf

The EFX file format and its implementation are designed to produce much smaller documents than XML and to parse and serialize them much faster.

The efxperf utility benchmarks real-world performance differences between EFX and XML. It simplifies the comparison by automating the encode and decode benchmarking process. A variety of options are available to fine-tune the benchmark so that it closely matches the scenarios of a given application.

The following command runs efxperf and displays detailed help text describing all available options:

efxperf.cmd or efxperf.sh

efxperf.jar is distributed with EFX, and requires efx.jar, efxsdk.jar and efxlic.jar (with a valid EFX license) to be present.

Before using efxperf, it is necessary to obtain an XML document that is representative of the type of data that is used for a given application. In the absence of application specific test data, test.xml and its associated schema test.cxs can be used for testing purposes. This test data is taken from the W3C Efficient XML Interchange Working Group testing framework.

The following test walk-throughs assume that the sample XML file to be used is called test.xml.

In-memory performance testing

The simplest test to perform is measuring EFX performance relative to XML using an in-memory data source. This test measures the raw encode and decode speeds without taking into account the compaction benefits of EFX when used on a network.

The following command compares EFX and XML decode speeds:

efxperf test.xml

Sample output:

warmup: 1000 ms or 5 runs
main: 1000 ms or 5 runs

[1] file: test.xml
  schema: 
XML-DECODE warmup 1047ms / 9 runs = 116.33ms per run
XML-DECODE main 1032ms / 13 runs = 79.38ms per run
XML-SIZE 937005 bytes
EFX-DECODE warmup 1000ms / 314 runs = 3.18ms per run
EFX-DECODE main 1000ms / 323 runs = 3.09ms per run
EFX-SIZE 72933 bytes

   XML         EFX         XML          EFX
  SIZE*       SIZE*      DECODE**     DECODE**    FILENAME
--------    --------    ---------    ---------    -------------------------------
  937005       72933        79.38         3.09    test.xml

* Size measurements are in bytes.
** Timing measurements are in milliseconds

The "XML SIZE" and "EFX SIZE" columns contain the size results, and the "XML DECODE" and "EFX DECODE" columns contain the performance results. For all columns, lower is better (smaller for size, faster for decode). The performance columns contain the time in milliseconds to parse an XML document or an EFX document. The reported time is for a single iteration, but it is averaged over multiple iterations. In addition, a predetermined "warmup" period keeps startup costs and runtime compilation from affecting the results.
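As a rough illustration of how the per-run figures relate to the totals, the following Python sketch recomputes the per-run time from the detail lines of the sample output above and derives the decode speedup. The regular expression and the per_run_ms helper are assumptions made for this example and are not part of efxperf. The sketch also assumes the line layout shown in the sample output.

```python
import re

# Matches detail lines such as:
#   XML-DECODE main 1032ms / 13 runs = 79.38ms per run
# The pattern is an assumption based on the sample output only.
DETAIL = re.compile(
    r"^(?P<fmt>\w+)-(?P<op>DECODE|ENCODE)\s+(?P<phase>warmup|main)\s+"
    r"(?P<total_ms>\d+)ms\s*/\s*(?P<runs>\d+)\s+runs"
)

def per_run_ms(line):
    """Return (format, phase, total_ms / runs) for a detail line, or None."""
    m = DETAIL.match(line.strip())
    if m is None:
        return None
    return m["fmt"], m["phase"], int(m["total_ms"]) / int(m["runs"])

# Detail lines copied from the in-memory decode sample output.
sample = """\
XML-DECODE warmup 1047ms / 9 runs = 116.33ms per run
XML-DECODE main 1032ms / 13 runs = 79.38ms per run
EFX-DECODE warmup 1000ms / 314 runs = 3.18ms per run
EFX-DECODE main 1000ms / 323 runs = 3.09ms per run
"""

# Keep only the "main" phase; the warmup phase is excluded from the results.
main = {}
for line in sample.splitlines():
    parsed = per_run_ms(line)
    if parsed and parsed[1] == "main":
        main[parsed[0]] = parsed[2]

speedup = main["XML"] / main["EFX"]
print(f"XML {main['XML']:.2f} ms, EFX {main['EFX']:.2f} ms, "
      f"speedup {speedup:.1f}x")
```

The recomputed values may differ from the printed ones in the last digit, because the sample figures appear to be truncated rather than rounded to two decimal places.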

The following command compares EFX and XML encode speeds:

efxperf -e test.xml

Sample output:

warmup: 1000 ms or 5 runs
main: 1000 ms or 5 runs

[1] file: test.xml
  schema: 
XML-ENCODE warmup 1000ms / 28 runs = 35.71ms per run
XML-ENCODE main 1000ms / 45 runs = 22.22ms per run
XML-SIZE 923797 bytes
EFX-ENCODE warmup 1000ms / 103 runs = 9.70ms per run
EFX-ENCODE main 1000ms / 109 runs = 9.17ms per run
EFX-SIZE 72933 bytes

   XML         EFX         XML          EFX
  SIZE*       SIZE*      ENCODE**     ENCODE**    FILENAME
--------    --------    ---------    ---------    -------------------------------
  923797       72933        22.22         9.17    test.xml

* Size measurements are in bytes.
** Timing measurements are in milliseconds

The columns are similar to the ones described for decode. The "ENCODE" columns contain the encode time for a single iteration in milliseconds. The lower the number, the better (faster) the result.

Network performance testing

The compact size of EFX documents can yield measurable performance differences compared to XML when data travels over a network. efxperf can be configured to use a network to give a much better idea of how EFX will perform in a real-world environment. This test uses the same data file (test.xml) as the previous test, but requires an additional machine and a network. The type of network (Ethernet, Wi-Fi, cellular, etc.) greatly affects the results, so the choice of network is an important consideration. The slower the network, the more pronounced the performance difference between the XML and EFX results.
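To see why slower networks widen the gap, the following Python sketch gives a back-of-the-envelope estimate of the time needed to move the test document over links of different speeds. The document sizes come from the in-memory sample output above. The link speeds are hypothetical values chosen for illustration, and the model deliberately ignores latency, protocol overhead and congestion, so it is not a substitute for an actual efxperf network run.

```python
XML_BYTES = 937005   # XML-SIZE from the in-memory decode sample output
EFX_BYTES = 72933    # EFX-SIZE from the in-memory decode sample output

# Hypothetical link speeds in bits per second (illustration only).
LINKS_BPS = {
    "1 Mbit/s": 1_000_000,
    "10 Mbit/s": 10_000_000,
    "100 Mbit/s": 100_000_000,
}

def transfer_ms(size_bytes, bits_per_second):
    """Idealized time in milliseconds to move size_bytes over a link."""
    return size_bytes * 8 / bits_per_second * 1000

for name, bps in LINKS_BPS.items():
    xml_ms = transfer_ms(XML_BYTES, bps)
    efx_ms = transfer_ms(EFX_BYTES, bps)
    print(f"{name:>10}: XML {xml_ms:9.1f} ms  EFX {efx_ms:8.1f} ms  "
          f"saved {xml_ms - efx_ms:9.1f} ms")
```

In this idealized model the absolute time saved by the smaller EFX document grows as the link gets slower, which matches the expectation stated above.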

This walk-through assumes that two machines, called host1 and host2, are available. host2 runs a relay that sends or receives data over the network to or from host1 as appropriate. host1 needs efxperf.jar, efx.jar, efxsdk.jar and efxlic.jar, whereas host2 needs only efxperf.jar and efx.jar.

To start the relay on host2 using port 8000, execute:

efxperfrelay 8000

Start the decode test on host1 using:

efxperf -net net://host2:8000 test.xml

Sample output:

network-host = hedwig:8000
warmup: 1000 ms or 5 runs
main: 1000 ms or 5 runs

[1] file: test.xml
  schema: 
XML-DECODE warmup 3875ms / 5 runs = 775.00ms per run
XML-DECODE main 1032ms / 11 runs = 93.81ms per run
XML-SIZE 937005 bytes
EFX-DECODE warmup 1000ms / 90 runs = 11.11ms per run
EFX-DECODE main 1000ms / 139 runs = 7.19ms per run
EFX-SIZE 72933 bytes

   XML         EFX         XML          EFX
  SIZE*       SIZE*      DECODE**     DECODE**    FILENAME
--------    --------    ---------    ---------    -------------------------------
  937005       72933        93.81         7.19    test.xml

* Size measurements are in bytes.
** Timing measurements are in milliseconds

The output is presented in the same format described above for the in-memory test.

While the relay on host2 is still running, producing encode results requires only the addition of the -e option on host1:

efxperf -e -net net://host2:8000 test.xml

Sample output:

network-host = hedwig:8000
warmup: 1000 ms or 5 runs
main: 1000 ms or 5 runs

[1] file: test.xml
  schema: 
XML-ENCODE warmup 1063ms / 11 runs = 96.63ms per run
XML-ENCODE main 1062ms / 13 runs = 81.69ms per run
XML-SIZE 923800 bytes
EFX-ENCODE warmup 1000ms / 95 runs = 10.52ms per run
EFX-ENCODE main 1000ms / 95 runs = 10.52ms per run
EFX-SIZE 72936 bytes

   XML         EFX         XML          EFX
  SIZE*       SIZE*      ENCODE**     ENCODE**    FILENAME
--------    --------    ---------    ---------    -------------------------------
  923800       72936        81.69        10.52    test.xml

* Size measurements are in bytes.
** Timing measurements are in milliseconds

Additional efxperf options and tips

The tests above demonstrate some common uses of efxperf. Additional options are available to tailor the benchmark more closely to a particular use case.

A schema is typically used in conjunction with EFX. Use the -schema option to pass in a schema during the test. For example:

java -jar efxperf.jar -schema test.xsd test.xml

Common EFX settings can also be passed to efxperf to measure their impact. For example, the efxperf -zip option enables the EFX -zip option:

java -jar efxperf.jar -zip test.xml

Another useful test measures the performance impact of GZIP on top of XML encode/decode. This test is enabled with the -gzxml option. Keep in mind that efxperf runs the -efx and -xml modes by default when no mode is specified on the command line. To produce output for all three modes, use the following:

java -jar efxperf.jar -efx -xml -gzxml test.xml

By default, efxperf runs for a minimum amount of time and ensures that a minimum number of iterations are performed. These minimums are adjustable. For example, the HotSpot compiler selected by the Java -server option spends much more time on up-front compilation, and giving efxperf a longer warmup period prevents that compilation from skewing the results. The -repeat, -time, -warmup, and -warmupTime options affect the execution time.
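When benchmarks are scripted, it can help to assemble the command line in one place. The following Python sketch builds an efxperf invocation with adjusted minimums. The build_efxperf_command helper is hypothetical, and the sketch assumes that the numeric arguments are plain integers with times given in milliseconds, which the help text suggests but does not state explicitly. The values 20 and 5000 are arbitrary illustrations, not recommended settings.

```python
import shlex

def build_efxperf_command(files, repeat=None, time_ms=None,
                          warmup=None, warmup_time_ms=None,
                          encode=False, jar="efxperf.jar"):
    """Build an argument list for efxperf (hypothetical helper).

    Only options named in the efxperf help text are emitted. Times are
    assumed to be plain millisecond counts.
    """
    cmd = ["java", "-jar", jar]
    if encode:
        cmd.append("-e")
    options = (
        ("-repeat", repeat),
        ("-time", time_ms),
        ("-warmup", warmup),
        ("-warmupTime", warmup_time_ms),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(int(value))]
    # Input files come last, as in the usage line.
    cmd += list(files)
    return cmd

# Example: a longer warmup, with arbitrary illustrative values.
cmd = build_efxperf_command(["test.xml"], warmup=20, warmup_time_ms=5000)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a machine where Java and the EFX jar files are available.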

efxperf can also accept multiple input test files, and will report the results for each file as a separate row in the output. This is useful when comparing a large number of files, particularly when running the tests over a long period. For example:

java -jar efxperf.jar test.xml test2.xml
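When many files are tested, the summary table can be post-processed by a script. The following Python sketch extracts the rows of the summary table and computes size and time ratios for each file. It assumes the five-column layout shown in the sample output for the default -efx and -xml modes; the parse_summary helper and its regular expression are assumptions for this example, and runs that add other modes such as -gzxml may produce a different layout. The sample text contains only the single row from the in-memory decode output above, and rows for additional files would be handled the same way.

```python
import re

# One summary row: XML size, EFX size, XML time, EFX time, file name.
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\S+)\s*$"
)

def parse_summary(text):
    """Return one dict per summary row found in efxperf output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, separator, footnote or detail line
        xml_size, efx_size = int(m[1]), int(m[2])
        xml_ms, efx_ms = float(m[3]), float(m[4])
        rows.append({
            "file": m[5],
            "size_ratio": xml_size / efx_size,
            "time_ratio": xml_ms / efx_ms,
        })
    return rows

# Summary table copied from the in-memory decode sample output.
sample = """\
   XML         EFX         XML          EFX
  SIZE*       SIZE*      DECODE**     DECODE**    FILENAME
--------    --------    ---------    ---------    -------------------------------
  937005       72933        79.38         3.09    test.xml

* Size measurements are in bytes.
** Timing measurements are in milliseconds
"""

rows = parse_summary(sample)
for row in rows:
    print(f"{row['file']}: {row['size_ratio']:.1f}x smaller, "
          f"{row['time_ratio']:.1f}x faster")
```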

All available options are outlined below:

Usage: efxperf [efx-options] [efxperf-mode]* [efxperf-options] ([schema-options] )+
   or: efxperf -netrelay [-v|-verbose] 
            EfxPerf autodetects if the file is EFX or XML and
                        measures encode performance if the file is XML or
                        decode performance if the file is EFX.
  -netrelay (-net)      Act as network relay for -network tests
                        (disables normal timing tasks)
 Schema Options:
  -schema         Use XML Schema optimizations (from XSD or CXS schema)
                        for the following files
  -noschema             Use schemaless for the following files
 EfxPerf Modes:
  -efx                  Time EFX SAX encode/decode (no text-XML)
  -transcode            Time EFX transcode from XML->EFX or EFX->XML
  -xml                  Time Text-XML encode/decode
  -gzxml                Time Text-XML + GZIP encode/decode
 EfxPerf Options:
  -decode (-d)          Time EFX->XML decode (default)
  -encode (-e)          Time XML->EFX encode
  -repeat (-r) <#>      Minimum run count (default: 5)
  -time (-t) <#>        Minimum run time (default: 5,000ms)
  -warmup (-wr) <#>     Minimum warmup run count (default: 1)
  -warmupTime (-wt) <#> Minimum warmup run time (default: 100ms)
  -network (-net)  Use network I/O to specified host. Only hostname
                        and port-number matter.  If hostname='localhost'
                        efxperf automatically starts network relay
                        otherwise user must manually start relay using
                        'efxperf -netrelay' option.
                        example: -net net://localhost:8080
  -multithread (-mt)    Use EFXFactory rather than EFXSingleThreadFactory
  -quiet (-q)           Less runtime status
  -verbose (-v)         More verbose output
  -help (-?)            Extended help
 Common EFX Options:
  -lexical (-l)         Preserve the lexical structure of the XML document in
                        addition to the logical information it contains.
  -extensible (-x)      Allow XML documents that contain extensions to and
                        deviations from the specified XML Schema.
  -strict               Require XML documents to strictly follow schema.
  -zip                  Use data compression optimizations to reduce file size.