Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

JSON (and CBOR/MessagePack) Tabular and Timeseries Outputs #6192

Merged
merged 96 commits into from
Dec 21, 2018

Conversation

mbadams5
Copy link
Contributor

@mbadams5 mbadams5 commented Jun 30, 2017

Pull request overview

This PR is dependent on #5893 going in first since this is built off that branch. The following link provides a more accurate comparison of the features in this specific PR, seen here.

This PR adds JSON outputs for Tabular and Timeseries outputs. Due to the JSON library we are using for the JSON input processing, we can also output CBOR and MessagePack binary formats. Both of these formats provide faster serialization/deserialization and much smaller file sizes.

The JSON output format was built off of work completed a couple years ago and updated to current EnergyPlus source code and JSON library.

Running the Outpatient Reference Model for an annual simulation has the following file sizes:

eplusout.mtr - 2.1M
eplusmtr.csv - 1.6M
eplusout.eso - 31M
eplusout.csv - 24M
eplusout.sql - 31M

eplusout_hourly.json - 42M
eplusout_hourly.cbor - 12M (~71% reduction to JSON, 50% compared to CSV)

eplusout.json - 5.0M
eplusout.cbor - 1.2M (76% reduction)

Work Checklist

Add to this list or remove from it as applicable. This is a simple templated set of guidelines.

  • Title of PR should be user-synopsis style (clearly understandable in a standalone changelog context)
  • At least one of the following appropriate labels must be added to this PR to be consumed into the changelog:
    • Defect: This pull request repairs a github defect issue. The github issue should be referenced in the PR description
    • Refactoring: This pull request includes code changes that don't change the functionality of the program, just perform refactoring
    • NewFeature: This pull request includes code to add a new feature to EnergyPlus
    • Performance: This pull request includes code changes that are directed at improving the runtime performance of EnergyPlus
    • DoNoPublish: This pull request includes changes that shouldn't be included in the changelog

Review Checklist

This will not be exhaustively relevant to every PR.

  • Code style (parentheses padding, variable names)
  • Functional code review (it has to work!)
  • If defect, results of running current develop vs this branch should exhibit the fix
  • CI status: all green or justified
  • Performance: CI Linux results include performance check -- verify this
  • Unit Test(s)
  • C++ checks:
    • Argument types
    • If any virtual classes, ensure virtual destructor included, other things
  • IDD changes:
    • Verify naming conventions and styles, memos and notes and defaults
    • Open windows IDF Editor with modified IDD to check for errors
    • If transition, add rules to spreadsheet
    • If transition, add transition source
    • If transition, update idfs
  • If new idf included, locally check the err file and other outputs
  • Documentation changes in place
  • Changed docs build successfully
  • ExpandObjects changes?
  • If output changes, including tabular output structure, add to output rules file for interfaces

sanyalj and others added 30 commits December 19, 2014 17:06
Initial time series data in JSON, output using cJSON
Conflicts:
	CMakeLists.txt
	src/EnergyPlus/CMakeLists.txt
Conflicts:
	tst/EnergyPlus/unit/PurchasedAirManager.unit.cc
# Conflicts:
#	cmake/CompilerFlags.cmake
#	src/EnergyPlus/CMakeLists.txt
#	src/EnergyPlus/EconomicLifeCycleCost.cc
#	src/EnergyPlus/EconomicTariff.cc
#	src/EnergyPlus/EnergyPlusPgm.cc
#	src/EnergyPlus/OutputProcessor.cc
#	src/EnergyPlus/OutputReportTabular.cc
Had to clear states before test runs.
@nrel-bot-2
Copy link

@mbadams5 it has been 8 days since this pull request was last updated.

@nrel-bot
Copy link

@mbadams5 @lgentile it has been 25 days since this pull request was last updated.

@mbadams5
Copy link
Contributor Author

@Myoldmopar @mjwitte I think this is nearly done. The encoding fixes I did on the inputs side were needed on the outputs side as well, as I have mentioned previously. One remaining issue that is solvable, but I don't particularly want to do it in this PR, is when the weather stats file has a different encoding than UTF-8.

Many (or all) of the included stats files in the weather directory are Western (Windows 1252) encoded, which causes issues when searching and outputting to UTF-8. I think I've fixed the code so that it will not fail with errors when it is encoded with Windows 1252, however, it does not output 100% correctly.

HTML output with Western (Windows 1252) encoded stats file
Windows-1252

HTML output with UTF-8 encoded stats file
UTF-8

Note the bad encoding of the degree symbol for latitude and longitude. Also all the lines related to Köppen classification and ASHRAE Handbook CDD/HDD are missing with the Western (Windows 1252) encoded stats file.

It can't find the Köppen classification because it is searching the Western (Windows 1252) line using an UTF-8 encoded string. We'd need to either re-write the specific parts to be encoding agnostic, read the file stats file as UTF-8, or convert the encoding of that specific input string to UTF-8.

@mjwitte
Copy link
Contributor

mjwitte commented Nov 15, 2018

For this particular table, it seems worthwhile to work around it. Would it work to search just parts of the string, leaving out the offending characters, and then output the full utf-8 string, not what's in the stat file? In fact, I'd prefer to see the line show up in the table with a blank value even if the data isn't found in the stat file.

@mbadams5
Copy link
Contributor Author

@mjwitte I get that, but this isn't specific to this PR. I just compiled develop which has the stat file at Western (Windows 1252) and OutputReportTabular.cc at Western (Windows 1252). When I change either the stat file or OutputReportTabular.cc to UTF-8 encoding, I get the following results:

Stat: Western (Windows 1252)
OutputReportTabular.cc: Western (Windows 1252)
Western (Windows 1252)/Western (Windows 1252)

Stat: UTF-8
OutputReportTabular.cc: Western (Windows 1252)
UTF-8/Western (Windows 1252)

Stat: UTF-8
OutputReportTabular.cc: UTF-8
UTF-8/UTF-8

Stat: Western (Windows 1252)
OutputReportTabular.cc: UTF-8
Western (Windows 1252)/UTF-8

@mjwitte
Copy link
Contributor

mjwitte commented Nov 15, 2018

@mbadams5 Sure, it can be a separate issue to make this table more robust, just do the right thing here for now and post a followup.

@Myoldmopar
Copy link
Member

@mbadams5 I pulled this locally, merged in latest develop, and am going to test it out. I am in the process of pointing E+ CI to the updated regression scripts, so that UTF-8 issue should go away very soon. Other than some final testing, are you ready for this to go in? @mjwitte anything else on your end?

@mjwitte
Copy link
Contributor

mjwitte commented Dec 12, 2018

@Myoldmopar The only part of this I've looked at/commented on is the encoding issue related to the Climatic Data Summary noted above. I've posted #7102 as a followup issue to fix this.

@Myoldmopar
Copy link
Member

I just ran the UTF8 test file on the new regression scripts with this branch and develop and it passes nicely. Which means that once CI is totally using the new scripts (later this week - maybe tomorrow), that error will go away. Nothing to be concerned about with this branch. Otherwise this branch seems OK. I'll do a final pass over the code and confirm it on the call in a few minutes before merging.

@Myoldmopar
Copy link
Member

@mbadams5 I tried this out this morning. I pulled latest develop in and re-built. I'll put some findings here as I go.

  • I added an Output:JSON, TimeSeries; object to 1ZoneUncontrolled.idf
    • Expected: Output files for each time series frequency
    • Outcome: No JSON output files
  • I then added a "Yes" to field A2 on the Output:JSON object, and re-ran, and got the outputs! This seems like a bug, as field A2 should be defaulted to Yes according to the IDD.

Spot checked some results and everything looks good and ran well. I think this is ready to drop in if you can address that issue. I'm still perusing the code one final time right now but it's good so far.

@mbadams5
Copy link
Contributor Author

@Myoldmopar I just pushed up a fix. See if that works.

@Myoldmopar
Copy link
Member

Pulled, built, and tested. JSON now outputs properly even if nothing is entered for field A2. I also reported cbor outputs, and imported them in python using a cbor package. It looks good when reported back out as normal JSON. I am happy with this.

I will let CI have a pass over it this afternoon and if it looks good, I'll merge it in. Thanks @mbadams5 !

@Myoldmopar
Copy link
Member

All green! Merging.

@Myoldmopar Myoldmopar merged commit 426dac9 into develop Dec 21, 2018
@Myoldmopar Myoldmopar deleted the output_processor_json branch December 21, 2018 00:21
@DavidHollman
Copy link
Contributor

... The following link provides a more accurate comparison of the features in this specific PR, seen here.

The link goes to a page that says "There isn’t anything to compare."

Is there an updated way to do that?

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
IDDChange Code changes impact the IDD file (cannot be merged after IO freeze) NewFeature Includes code to add a new feature to EnergyPlus
Projects
None yet
Development

Successfully merging this pull request may close these issues.