mirror of
https://github.com/didi/KnowStreaming.git
synced 2026-01-12 19:12:48 +08:00
Add km module kafka gateway
This commit is contained in:
261
clients/build/resources/main/common/message/README.md
Normal file
261
clients/build/resources/main/common/message/README.md
Normal file
@@ -0,0 +1,261 @@
|
||||
Apache Kafka Message Definitions
|
||||
================================
|
||||
|
||||
Introduction
|
||||
------------
|
||||
The JSON files in this directory define the Apache Kafka message protocol.
|
||||
This protocol describes what information clients and servers send to each
|
||||
other, and how it is serialized. Note that this version of JSON supports
|
||||
comments. Comments begin with a double forward slash.
|
||||
|
||||
When Kafka is compiled, these specification files are translated into Java code
|
||||
to read and write messages. Any change to these JSON files will trigger a
|
||||
recompilation of this generated code.
|
||||
|
||||
These specification files replace an older system where hand-written
|
||||
serialization code was used. Over time, we will migrate all messages to using
|
||||
automatically generated serialization and deserialization code.
|
||||
|
||||
Requests and Responses
|
||||
----------------------
|
||||
The Kafka protocol features requests and responses. Requests are sent to a
|
||||
server in order to get a response. Each request is uniquely identified by a
|
||||
16-bit integer called the "api key". The API key of the response will always
|
||||
match that of the request.
|
||||
|
||||
Each message has a unique 16-bit version number. The schema might be different
|
||||
for each version of the message. Sometimes, the version is incremented even
|
||||
though the schema has not changed. This may indicate that the server should
|
||||
behave differently in some way. The version of a response must always match
|
||||
the version of the corresponding request.
|
||||
|
||||
Each request or response has a top-level field named "validVersions." This
|
||||
specifies the versions of the protocol that our code understands. For example,
|
||||
specifying "0-2" indicates that we understand versions 0, 1, and 2. You must
|
||||
always specify the highest message version which is supported.
|
||||
|
||||
The only old message versions that are no longer supported are version 0 of
|
||||
MetadataRequest and MetadataResponse. In general, since we adopted KIP-97,
|
||||
dropping support for old message versions is no longer allowed without a KIP.
|
||||
Therefore, please be careful not to increase the lower end of the version
|
||||
support interval for any message.
|
||||
|
||||
MessageData Objects
|
||||
-------------------
|
||||
Using the JSON files in this directory, we generate Java code for MessageData
|
||||
objects. These objects store request and response data for kafka. MessageData
|
||||
objects do not contain a version number. Instead, a single MessageData object
|
||||
represents every possible version of a Message. This makes working with
|
||||
messages more convenient, because the same code path can be used for every
|
||||
version of a message.
|
||||
|
||||
Fields
|
||||
------
|
||||
Each message contains an array of fields. Fields specify the data that should
|
||||
be sent with the message. In general, fields have a name, a type, and version
|
||||
information associated with them.
|
||||
|
||||
The order that fields appear in a message is important. Fields which come
|
||||
first in the message definition will be sent first over the network. Changing
|
||||
the order of the fields in a message is an incompatible change.
|
||||
|
||||
In each new message version, we may add or subtract fields. For example, if we
|
||||
are creating a new version 3 of a message, we can add a new field with the
|
||||
version spec "3+". This specifies that the field only appears in version 3 and
|
||||
later. If a field is being removed, we should change its version from "0+" to
|
||||
"0-2" to indicate that it will not appear in version 3 and later.
|
||||
|
||||
Field Types
|
||||
-----------
|
||||
There are several primitive field types available.
|
||||
|
||||
* "boolean": either true or false.
|
||||
|
||||
* "int8": an 8-bit integer.
|
||||
|
||||
* "int16": a 16-bit integer.
|
||||
|
||||
* "int32": a 32-bit integer.
|
||||
|
||||
* "int64": a 64-bit integer.
|
||||
|
||||
* "float64": is a double-precision floating point number (IEEE 754).
|
||||
|
||||
* "string": a UTF-8 string.
|
||||
|
||||
* "bytes": binary data.
|
||||
|
||||
In addition to these primitive field types, there is also an array type. Array
|
||||
types start with a "[]" and end with the name of the element type. For
|
||||
example, []Foo declares an array of "Foo" objects. Array fields have their own
|
||||
array of fields, which specifies what is in the contained objects.
|
||||
|
||||
For information about how fields are serialized, see the [Kafka Protocol
|
||||
Guide](https://kafka.apache.org/protocol.html).
|
||||
|
||||
Nullable Fields
|
||||
---------------
|
||||
Booleans, ints, and floats can never be null. However, fields that are strings,
|
||||
bytes, or arrays may optionally be "nullable." When a field is "nullable," that
|
||||
simply means that we are prepared to serialize and deserialize null entries for
|
||||
that field.
|
||||
|
||||
If you want to declare a field as nullable, you set "nullableVersions" for that
|
||||
field. Nullability is implemented as a version range in order to accomodate a
|
||||
very common pattern in Kafka where a field that was originally not nullable
|
||||
becomes nullable in a later version.
|
||||
|
||||
If a field is declared as non-nullable, and it is present in the message
|
||||
version you are using, you should set it to a non-null value before serializing
|
||||
the message. Otherwise, you will get a runtime error.
|
||||
|
||||
Tagged Fields
|
||||
-------------
|
||||
Tagged fields are an extension to the Kafka protocol which allows optional data
|
||||
to be attached to messages. Tagged fields can appear at the root level of
|
||||
messages, or within any structure in the message.
|
||||
|
||||
Unlike mandatory fields, tagged fields can be added to message versions that
|
||||
already exists. Older servers will ignore new tagged fields which they do not
|
||||
understand.
|
||||
|
||||
In order to make a field tagged, set a "tag" for the field, and also set up
|
||||
tagged versions for the field. The taggedVersions you specify should be
|
||||
open-ended-- that is, they should specify a start version, but not an end
|
||||
version.
|
||||
|
||||
You can remove support for a tagged field from a specific version of a message,
|
||||
but you can't reuse a tag once it has been used for something else. Once tags
|
||||
have been used for something, they can't be used for anything else, without
|
||||
breaking compatibilty.
|
||||
|
||||
Note that tagged fields can only be added to "flexible" message versions.
|
||||
|
||||
Flexible Versions
|
||||
-----------------
|
||||
Kafka serialization has been improved over time to be more flexible and
|
||||
efficient. Message versions that contain these improvements are referred to as
|
||||
"flexible versions."
|
||||
|
||||
In flexible verisons, variable-length fields such as strings, arrays, and bytes
|
||||
fields are serialized in a more efficient way that saves space. The new
|
||||
serialization types start with compact. For example COMPACT_STRING is a more
|
||||
efficient form of STRING.
|
||||
|
||||
Serializing Messages
|
||||
--------------------
|
||||
The Message#write method writes out a message to a buffer. The fields that are
|
||||
written out will depend on the version number that you supply to write(). When
|
||||
you write out a message using an older version, fields that are too old to be
|
||||
present in the schema will be omitted.
|
||||
|
||||
When working with older message versions, please verify that the older message
|
||||
schema includes all the data that needs to be sent. For example, it is probably
|
||||
OK to skip sending a timeout field. However, a field which radically alters the
|
||||
meaning of the request, such as a "validateOnly" boolean, should not be ignored.
|
||||
|
||||
It's often useful to know how much space a message will take up before writing
|
||||
it out to a buffer. You can find this out by calling the Message#size method.
|
||||
|
||||
You can also convert a message to a Struct by calling Message#toStruct. This
|
||||
allows you to use the functions that serialize Structs to buffers.
|
||||
|
||||
Deserializing Messages
|
||||
----------------------
|
||||
Message objects may be deserialized using the Message#read method. This method
|
||||
overwrites all the data currently in the message object with new data.
|
||||
|
||||
You can also deserialize a message from a Struct by calling Message#fromStruct.
|
||||
The Struct will not be modified.
|
||||
|
||||
Any fields in the message object that are not present in the version that you
|
||||
are deserializing will be reset to default values. Unless a custom default has
|
||||
been set:
|
||||
|
||||
* Integer fields default to 0.
|
||||
|
||||
* Floats default to 0.
|
||||
|
||||
* Booleans default to false.
|
||||
|
||||
* Strings default to the empty string.
|
||||
|
||||
* Bytes fields default to the empty byte array.
|
||||
|
||||
* Array fields default to empty.
|
||||
|
||||
You can specify "null" as a default value for a string field by specifing the
|
||||
literal string "null". Note that you can only specify null as a default if all
|
||||
versions of the field are nullable.
|
||||
|
||||
Custom Default Values
|
||||
---------------------
|
||||
You may set a custom default for fields that are integers, booleans, floats, or
|
||||
strings. Just add a "default" entry in the JSON object. The custom default
|
||||
overrides the normal default for the type. So for example, you could make a
|
||||
boolean field default to true rather than false, and so forth.
|
||||
|
||||
Note that the default must be valid for the field type. So the default for an
|
||||
int16 field must be an integer that fits in 16 bits, and so forth. You may
|
||||
specify hex or octal values, as long as they are prefixed with 0x or 0. It is
|
||||
currently not possible to specify a custom default for bytes or array fields.
|
||||
|
||||
Custom defaults are useful when an older message version lacked some
|
||||
information. For example, if an older request lacked a timeout field, you may
|
||||
want to specify that the server should assume that the timeout for such a
|
||||
request is 5000 ms (or some other arbitrary value).
|
||||
|
||||
Ignorable Fields
|
||||
----------------
|
||||
When we write messages using an older or newer format, not all fields may be
|
||||
present. The message receiver will fill in the default value for the field
|
||||
during deserialization. Therefore, if the source field was set to a non-default
|
||||
value, that information will be lost.
|
||||
|
||||
In some cases, this information loss is acceptable. For example, if a timeout
|
||||
field does not get preserved, this is not a problem. However, in other cases,
|
||||
the field is really quite important and should not be discarded. One example is
|
||||
a "verify only" boolean which changes the whole meaning of the request.
|
||||
|
||||
By default, we assume that information loss is not acceptable. The message
|
||||
serialization code will throw an exception if the ignored field is not set to
|
||||
the default value. If information loss for a field is OK, please set
|
||||
"ignorable" to true for the field to disable this behavior. When ignorable is
|
||||
set to true, the field may be silently omitted during serialization.
|
||||
|
||||
Hash Sets
|
||||
---------
|
||||
One very common pattern in Kafka is to load array elements from a message into
|
||||
a Map or Set for easier access. The message protocol makes this easier with
|
||||
the "mapKey" concept.
|
||||
|
||||
If some of the elements of an array are annotated with "mapKey": true, the
|
||||
entire array will be treated as a linked hash set rather than a list. Elements
|
||||
in this set will be accessible in O(1) time with an automatically generated
|
||||
"find" function. The order of elements in the set will still be preserved,
|
||||
however. New entries that are added to the set always show up as last in the
|
||||
ordering.
|
||||
|
||||
Incompatible Changes
|
||||
--------------------
|
||||
It's very important to avoid making incompatible changes to the message
|
||||
protocol. Here are some examples of incompatible changes:
|
||||
|
||||
#### Making changes to a protocol version which has already been released.
|
||||
Protocol versions that have been released must be regarded as done. If there
|
||||
were mistakes, they should be corrected in a new version rather than changing
|
||||
the existing version.
|
||||
|
||||
#### Re-ordering existing fields.
|
||||
It is OK to add new fields before or after existing fields. However, existing
|
||||
fields should not be re-ordered with respect to each other.
|
||||
|
||||
#### Changing the default of an existing field.
|
||||
You must never change the default of a field which already exists. Otherwise,
|
||||
new clients and old servers will not agree on the default, and so forth.
|
||||
|
||||
#### Changing the type of an existing field.
|
||||
One exception is that an array of primitives may be changed to an array of
|
||||
structures containing the same data, as long as the conversion is done
|
||||
correctly. The Kafka protocol does not do any "boxing" of structures, so an
|
||||
array of structs that contain a single int32 is the same as an array of int32s.
|
||||
Reference in New Issue
Block a user