1.
Varints are a method of serializing integers using one or more bytes. Smaller
numbers take fewer bytes. Each byte in a varint, except the last byte, has the
most significant bit (msb) set, which indicates that there are further bytes to
come. The lower 7 bits of each byte store the two's complement representation
of the number in groups of 7 bits, least significant group first. For example:
1 --> 00000001
300 --> 10101100 00000010
Here 300 is 100101100 in binary; split into 7-bit groups it is 0000010 0101100,
and the low group is written first with its msb set.
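The varint rule above can be sketched in a few lines of C++. This is a minimal illustration, not the protobuf library's own code; the function names EncodeVarint and DecodeVarint are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode an unsigned 64-bit value as a varint: 7 bits per byte,
// least significant group first, msb set on every byte but the last.
// (Hypothetical helper, not a protobuf API.)
std::vector<std::uint8_t> EncodeVarint(std::uint64_t value) {
  std::vector<std::uint8_t> out;
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
  return out;
}

// Decode a varint starting at bytes[pos] and advance pos past it.
// Returns false if the input ends before a byte with a clear msb is seen.
bool DecodeVarint(const std::vector<std::uint8_t>& bytes, std::size_t& pos,
                  std::uint64_t& value) {
  value = 0;
  for (int shift = 0; shift < 64 && pos < bytes.size(); shift += 7) {
    const std::uint8_t b = bytes[pos++];
    value |= static_cast<std::uint64_t>(b & 0x7F) << shift;
    if ((b & 0x80) == 0) return true;  // last byte of this varint
  }
  return false;
}
```

With these helpers, EncodeVarint(300) yields the two bytes AC 02, matching the bit pattern shown above.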
2.
The binary version of a message
just uses the field's number as the key – the name and declared type for each
field can only be determined on the decoding end by referencing the message
type's definition. When a message is encoded, the keys and values are
concatenated into a byte stream.
3.
When the message is being decoded, the parser needs to be able to skip fields
that it doesn't recognize. The "key" for each pair in a wire-format message is
actually two values: the field number from your .proto file, plus a wire type
that provides just enough information to find the length of the following value:
Type | Meaning | Used For
0 | Varint | int32, int64, uint32, uint64, sint32, sint64, bool, enum
1 | 64-bit | fixed64, sfixed64, double
2 | Length-delimited | string, bytes, embedded messages, packed repeated fields
3 | Start group | groups (deprecated)
4 | End group | groups (deprecated)
5 | 32-bit | fixed32, sfixed32, float
Each key in the streamed message is a varint with the value
(field_number << 3) | wire_type. In other words, the last three bits of the
number store the wire type.
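As a small sketch of the key layout, the following C++ packs and unpacks a key. The enum and function names are hypothetical, chosen only for this illustration; the numeric wire type values come from the table above.

```cpp
#include <cstdint>

// Wire type numbers as listed in the table above.
enum WireType : std::uint32_t {
  kVarint = 0,
  kFixed64 = 1,
  kLengthDelimited = 2,
  kStartGroup = 3,
  kEndGroup = 4,
  kFixed32 = 5
};

// Combine a field number and wire type into the key value.
// The result is itself written to the stream as a varint.
constexpr std::uint32_t MakeKey(std::uint32_t field_number, WireType type) {
  return (field_number << 3) | type;
}

// The upper bits of the key hold the field number...
constexpr std::uint32_t FieldNumber(std::uint32_t key) { return key >> 3; }

// ...and the lowest three bits hold the wire type.
constexpr WireType GetWireType(std::uint32_t key) {
  return static_cast<WireType>(key & 0x7);
}
```

For instance, field number 1 with a varint value gives the key 0x08, and field number 2 with a length-delimited value gives 0x12.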
4.
There is an important difference between the signed int types (sint32 and
sint64) and the "standard" int types (int32 and int64) when it comes to
encoding negative numbers. If you use int32 or int64 as the type for a negative
number, the resulting varint is always ten bytes long: it is, effectively,
treated like a very large unsigned integer. If you use one of the signed types,
the resulting varint uses ZigZag encoding, which is much more efficient.
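The ten-byte cost can be checked with a short calculation. The sketch below assumes the negative int32 value is sign-extended to 64 bits before being written as a varint; the helper names are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes a varint needs for an unsigned 64-bit value.
std::size_t VarintSize(std::uint64_t value) {
  std::size_t n = 1;
  while (value >= 0x80) {
    value >>= 7;
    ++n;
  }
  return n;
}

// Size of an int32 field's value on the wire, assuming the value is
// sign-extended to 64 bits and then treated as unsigned.
std::size_t Int32ValueSize(std::int32_t v) {
  return VarintSize(static_cast<std::uint64_t>(static_cast<std::int64_t>(v)));
}
```

Under that assumption, -1 becomes a 64-bit value with every bit set, which needs ten 7-bit groups, while 1 still fits in a single byte.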
5.
ZigZag encoding maps signed
integers to unsigned integers so that numbers with a small absolute value
(for instance, -1) have a small varint encoded value too. It does this in a way
that "zig-zags" back and forth through the positive and negative
integers, so that -1 is encoded as 1, 1 is encoded as 2, -2 is encoded as 3,
and so on.
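A minimal sketch of the 32-bit ZigZag mapping follows. The function names are hypothetical, and the code works on unsigned values to avoid relying on how the compiler shifts negative numbers.

```cpp
#include <cstdint>

// Map a signed 32-bit integer to an unsigned one:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
std::uint32_t ZigZagEncode32(std::int32_t n) {
  const std::uint32_t u = static_cast<std::uint32_t>(n);
  // Shift the bit pattern left by one; for negatives, flip every bit.
  return (u << 1) ^ (n < 0 ? 0xFFFFFFFFu : 0u);
}

// Inverse mapping: the lowest bit says whether the original was negative.
std::int32_t ZigZagDecode32(std::uint32_t z) {
  const std::uint32_t u = (z >> 1) ^ (0u - (z & 1u));
  // Conversion back to signed assumes two's complement (guaranteed in C++20).
  return static_cast<std::int32_t>(u);
}
```

The encoded result is then written as an ordinary varint, so -1 costs one byte instead of ten.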
6.
Non-varint numeric types are
stored in little-endian byte order.
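For illustration, here is one way to write a 32-bit value in little-endian order regardless of the host's byte order. The names AppendFixed32 and AppendFloat are made up for this sketch, and the float helper assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Append a 32-bit value, least significant byte first.
void AppendFixed32(std::vector<std::uint8_t>& out, std::uint32_t value) {
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
  }
}

// Append a float by copying its bit pattern into a 32-bit integer first.
// Assumes float is a 32-bit IEEE 754 type.
void AppendFloat(std::vector<std::uint8_t>& out, float value) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "this sketch expects a 32-bit float");
  std::uint32_t bits = 0;
  std::memcpy(&bits, &value, sizeof(bits));
  AppendFixed32(out, bits);
}
```

So the value 0x12345678 is written as the bytes 78 56 34 12; a 64-bit value would follow the same pattern with eight bytes.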
7.
A wire type of 2 (length-delimited) means that the value is a varint-encoded
length followed by the specified number of bytes of data. The key (tag number
and wire type) comes first, then the length, then the data.
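As a sketch of that layout, the following C++ writes a string field as key, length, then raw bytes. The helper names are hypothetical and the field number used in the note after the code is only an example.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Append an unsigned value as a varint (7 bits per byte, low group first).
void AppendVarint(std::vector<std::uint8_t>& out, std::uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Append a length-delimited field: key, varint length, then the payload.
void AppendLengthDelimited(std::vector<std::uint8_t>& out,
                           std::uint32_t field_number,
                           const std::string& payload) {
  const std::uint64_t kWireTypeLengthDelimited = 2;
  AppendVarint(out, (static_cast<std::uint64_t>(field_number) << 3) |
                        kWireTypeLengthDelimited);
  AppendVarint(out, payload.size());
  out.insert(out.end(), payload.begin(), payload.end());
}
```

Writing the string "testing" to field number 2 this way produces 12 07 74 65 73 74 69 6e 67: the key 0x12, the length 7, and the seven characters.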
8.
If your message definition has repeated elements (without the [packed=true]
option), the encoded message has zero or more key-value pairs with the same
tag number. These repeated values do not have to appear consecutively; they
may be interleaved with other fields. The order of the elements with respect
to each other is preserved when parsing, though the ordering with respect to
other fields is lost.
9.
Normally, an encoded message would never have more than one instance of an
optional or required field. However, parsers are expected to handle the case
in which they do. For numeric types and strings, if the same field appears
multiple times, the parser accepts the last value it sees. For embedded message
fields, the parser merges multiple instances of the same field, as if with the
Message.MergeFrom method. That is, all singular scalar fields in the latter
instance replace those in the former, singular embedded messages are merged,
and repeated fields are concatenated. The effect of these rules is that parsing
the concatenation of two encoded messages produces exactly the same result as
if you had parsed the two messages separately and merged the resulting objects:
    MyMessage message;
    message.ParseFromString(str1 + str2);
is equivalent to this:
    MyMessage message, message2;
    message.ParseFromString(str1);
    message2.ParseFromString(str2);
    message.MergeFrom(message2);
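The "last value wins" rule for scalar fields can be imitated without the protobuf library. The toy parser below is only a sketch under strong assumptions: it understands nothing but varint (wire type 0) fields and stores them in a map, so it does not model embedded-message merging or repeated fields. ReadVarint and ParseVarintFields are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Read one varint at bytes[pos] and advance pos; false on truncated input.
bool ReadVarint(const std::vector<std::uint8_t>& bytes, std::size_t& pos,
                std::uint64_t& value) {
  value = 0;
  for (int shift = 0; shift < 64 && pos < bytes.size(); shift += 7) {
    const std::uint8_t b = bytes[pos++];
    value |= static_cast<std::uint64_t>(b & 0x7F) << shift;
    if ((b & 0x80) == 0) return true;
  }
  return false;
}

// Toy parser for a stream of varint-only fields.
// A later occurrence of a field number overwrites the earlier one.
bool ParseVarintFields(const std::vector<std::uint8_t>& bytes,
                       std::map<std::uint32_t, std::uint64_t>& fields) {
  std::size_t pos = 0;
  while (pos < bytes.size()) {
    std::uint64_t key = 0;
    std::uint64_t value = 0;
    if (!ReadVarint(bytes, pos, key)) return false;
    if ((key & 0x7) != 0) return false;  // this sketch handles wire type 0 only
    if (!ReadVarint(bytes, pos, value)) return false;
    fields[static_cast<std::uint32_t>(key >> 3)] = value;  // last value wins
  }
  return true;
}
```

Parsing the concatenation of 08 01 10 05 and 08 02 leaves field 1 set to 2 and field 2 set to 5, the same result as parsing the two pieces into one map in sequence.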
10.
A packed repeated field
containing zero elements does not appear in the encoded message. Otherwise, all
of the elements of the field are packed into a single key-value pair with wire
type 2 (length-delimited). Each element is encoded the same way it would be
normally, except without a tag preceding it. Only repeated fields of primitive
numeric types (types which use the varint, 32-bit, or 64-bit wire types) can be
declared "packed".
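A packed field of varint elements can be sketched as follows. The helper names are invented for this example, and the field number and values in the note after the code are illustrative only.

```cpp
#include <cstdint>
#include <vector>

// Append an unsigned value as a varint (7 bits per byte, low group first).
void AppendVarint(std::vector<std::uint8_t>& out, std::uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Append a packed repeated field whose elements are varints.
// An empty field writes nothing; otherwise one key (wire type 2) and one
// length precede all elements, which are written back to back without keys.
void AppendPackedVarints(std::vector<std::uint8_t>& out,
                         std::uint32_t field_number,
                         const std::vector<std::uint64_t>& values) {
  if (values.empty()) return;
  std::vector<std::uint8_t> payload;
  for (std::uint64_t v : values) AppendVarint(payload, v);
  AppendVarint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
  AppendVarint(out, payload.size());
  out.insert(out.end(), payload.begin(), payload.end());
}
```

Packing the values 3, 270 and 86942 into field number 4 gives 22 06 03 8E 02 9E A7 05: the key 0x22, a payload length of 6, then the three varints.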
11.
While you can use field numbers in any order in a .proto, when a message is
serialized its known fields should be written sequentially by field number.
This allows parsing code to use optimizations that rely on field numbers being
in sequence. However, protocol buffer parsers must be able to parse fields in
any order, as not all messages are created by simply serializing an object. For
instance, it's sometimes useful to merge two messages by simply concatenating
them.
12.
If a message has unknown fields, the current Java implementations write them
in arbitrary order after the sequentially-ordered known fields.