Difference between revisions of "Hypergraph Format"

Revision as of 11:18, 8 November 2010

@@ Line 10: / Line 10: @@
 Con:
 * Space inefficiency
+* Requires custom parser for speed
 == Protocol Buffers ==
@@ Line 21: / Line 22: @@
 * Very fast to read (particularly in C++ and Java, hopefully soon in Python)
 * Very space efficient
-* Implementations in every language (although requires a separate library)
+* Implementations in Java, C++ and Python
 * Automatically generates typed stubs
 Con:
+* No implementations for Perl, C#, or other languages commonly used by NLP folks
+* Requires a separate library; adds an external dependency to spec
 * "It's really easy to get up to some of the data size limits that are in place to prevent malicious data from having the PB parser allocate too much memory". Some of these limits are described under SetTotalBytesLimit on [http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google.protobuf.io.coded_stream.html this page].
 * "You typically have to create a full hypergraph protocol buffer object before you can serialize it, so you either have to use the PB data structures internally in your code or you have to copy your data structure. While doing this copy, you can end up with two copies of the forest in memory, which is bad for memory usage."
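
The memory concern in the last point can be illustrated with a minimal sketch. It does not use the Protocol Buffers API. Instead it uses only the Python standard library to show the alternative the quote implies: writing hyperedges one at a time as length-prefixed records, so the forest is never copied into a second serializable object. The edge layout (head node id plus a list of tail node ids), the `EDGES` data, and the function names are assumptions made for illustration and are not part of any format proposed on this page.

```python
import io
import struct

# Hypothetical in-memory forest: each hyperedge is (head node id, tail node ids).
# This layout is an illustration only, not a proposed hypergraph format.
EDGES = [(3, [1, 2]), (4, [3]), (5, [3, 4])]


def encode_edge(head, tails):
    """Pack one hyperedge as head, tail count, then the tails (unsigned 32-bit, little-endian)."""
    return struct.pack("<II%dI" % len(tails), head, len(tails), *tails)


def write_edges(stream, edges):
    """Write edges one at a time, each prefixed with its byte length.

    Only one encoded edge is held in memory at once, so the forest is
    never duplicated into a second, serializable copy.
    """
    for head, tails in edges:
        payload = encode_edge(head, tails)
        stream.write(struct.pack("<I", len(payload)))
        stream.write(payload)


def read_edges(stream):
    """Yield (head, tails) pairs back from a stream written by write_edges."""
    while True:
        prefix = stream.read(4)
        if not prefix:
            return
        (size,) = struct.unpack("<I", prefix)
        payload = stream.read(size)
        head, count = struct.unpack_from("<II", payload, 0)
        tails = list(struct.unpack_from("<%dI" % count, payload, 8))
        yield head, tails


buf = io.BytesIO()
write_edges(buf, EDGES)
buf.seek(0)
decoded = list(read_edges(buf))
```

The trade-off in this sketch is that the reader must understand the framing itself, which is the kind of custom parsing work that a generated Protocol Buffers stub would otherwise provide.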