
LevelDB Source Code Analysis: Encoding

Date: 2019-04-05

Encoding (util/coding.h, util/coding.cc)

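LevelDB serializes integers into strings as raw bytes rather than as decimal text, in a form that can still be told apart from the ASCII characters stored around them. Two formats are used: a fixed-length encoding and a variable-length encoding (varint).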

First, the fixed-length encoding:

    void EncodeFixed32(char *buf, uint32_t value) {
      if (port::kLittleEndian) {
        // Host byte order already matches the on-disk order.
        memcpy(buf, &value, sizeof(value));
      } else {
        // Emit the least significant byte first.
        buf[0] = value & 0xff;
        buf[1] = (value >> 8) & 0xff;
        buf[2] = (value >> 16) & 0xff;
        buf[3] = (value >> 24) & 0xff;
      }
    }

    void EncodeFixed64(char *buf, uint64_t value) {
      if (port::kLittleEndian) {
        memcpy(buf, &value, sizeof(value));
      } else {
        buf[0] = value & 0xff;
        buf[1] = (value >> 8) & 0xff;
        buf[2] = (value >> 16) & 0xff;
        buf[3] = (value >> 24) & 0xff;
        buf[4] = (value >> 32) & 0xff;
        buf[5] = (value >> 40) & 0xff;
        buf[6] = (value >> 48) & 0xff;
        buf[7] = (value >> 56) & 0xff;
      }
    }

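The code branches on the host's byte order, but the encoded string is always little-endian. On a little-endian machine a memcpy is enough; otherwise each 8-bit group of the value is stored in one char, least significant byte first. Because the width is fixed (4 or 8 bytes), a decoder always knows how many bytes to read, so the integer can be separated from surrounding data such as ASCII text without any delimiter.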
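To make the byte layout concrete, here is a minimal sketch of a round trip through the fixed-length format. The names EncodeFixed32Portable and DecodeFixed32Portable are hypothetical and are not LevelDB's API; the sketch always takes the shift-based path, so it produces the same bytes regardless of host byte order.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LevelDB): writes value into buf as
// 4 bytes, least significant byte first, using shifts only.
inline void EncodeFixed32Portable(char* buf, uint32_t value) {
  assert(buf != nullptr);
  buf[0] = static_cast<char>(value & 0xff);
  buf[1] = static_cast<char>((value >> 8) & 0xff);
  buf[2] = static_cast<char>((value >> 16) & 0xff);
  buf[3] = static_cast<char>((value >> 24) & 0xff);
}

// Hypothetical inverse: reassembles the 4 little-endian bytes.
// Going through unsigned char avoids sign extension of bytes >= 0x80.
inline uint32_t DecodeFixed32Portable(const char* buf) {
  assert(buf != nullptr);
  const unsigned char* p = reinterpret_cast<const unsigned char*>(buf);
  return static_cast<uint32_t>(p[0]) |
         (static_cast<uint32_t>(p[1]) << 8) |
         (static_cast<uint32_t>(p[2]) << 16) |
         (static_cast<uint32_t>(p[3]) << 24);
}
```

Encoding 0x12345678 this way yields the bytes 78 56 34 12, and decoding those four bytes returns the original value.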

Next are the interface functions for the fixed-length encoding:

    // Encode a 32-bit integer in fixed-length form and append it to dst
    void PutFixed32(std::string *dst, uint32_t value);
    // Encode a 64-bit integer in fixed-length form and append it to dst
    void PutFixed64(std::string *dst, uint64_t value);
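As an illustration of what "append to dst" means here, the following is a sketch of how such a function could be built on top of a fixed-length encoder. PutFixed32Sketch and EncodeFixed32LE are hypothetical names chosen for this example, and the sketch is an assumption about the general shape rather than a copy of LevelDB's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a fixed-length encoder: 4 bytes, little-endian.
inline void EncodeFixed32LE(char* buf, uint32_t value) {
  for (int i = 0; i < 4; ++i) {
    buf[i] = static_cast<char>((value >> (8 * i)) & 0xff);
  }
}

// Hypothetical sketch: encode into a stack buffer, then append exactly
// sizeof(buf) bytes. The explicit length matters because the encoded
// bytes may contain '\0'.
inline void PutFixed32Sketch(std::string* dst, uint32_t value) {
  assert(dst != nullptr);
  char buf[sizeof(value)];
  EncodeFixed32LE(buf, value);
  dst->append(buf, sizeof(buf));
}
```

Each call grows the string by exactly 4 bytes, so a sequence of values can be read back by stepping through the string 4 bytes at a time.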

Then the variable-length encoding:

    char *EncodeVarint32(char *dst, uint32_t v) {
      // Operate on characters as unsigneds
      unsigned char *ptr = reinterpret_cast<unsigned char *>(dst);
      static const int B = 128;
      if (v < (1 << 7)) {
        *(ptr++) = v;
      } else if (v < (1 << 14)) {
        *(ptr++) = v | B;
        *(ptr++) = v >> 7;
      } else if (v < (1 << 21)) {
        *(ptr++) = v | B;
        *(ptr++) = (v >> 7) | B;
        *(ptr++) = v >> 14;
      } else if (v < (1 << 28)) {
        *(ptr++) = v | B;
        *(ptr++) = (v >> 7) | B;
        *(ptr++) = (v >> 14) | B;
        *(ptr++) = v >> 21;
      } else {
        *(ptr++) = v | B;
        *(ptr++) = (v >> 7) | B;
        *(ptr++) = (v >> 14) | B;
        *(ptr++) = (v >> 21) | B;
        *(ptr++) = v >> 28;
      }
      return reinterpret_cast<char *>(ptr);
    }

    char *EncodeVarint64(char *dst, uint64_t v) {
      static const int B = 128;
      unsigned char *ptr = reinterpret_cast<unsigned char *>(dst);
      while (v >= B) {
        *(ptr++) = (v & (B - 1)) | B;
        v >>= 7;
      }
      *(ptr++) = static_cast<unsigned char>(v);
      return reinterpret_cast<char *>(ptr);
    }

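LevelDB's variable-length encoding is neatly designed. The value is split into groups of 7 bits, and each group is stored in one char, least significant group first (the same little-endian order as the fixed-length encoding). The highest bit of each char serves as a continuation flag: it is set to 1 when more bytes follow, and it is 0 in the last char, which marks the end of the encoded value. A value below 128 therefore occupies a single byte equal to the corresponding ASCII code (0-127), while every non-final byte of a longer value is 128 or above.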

For example, the binary value 11001011101111001 (104313 in decimal) is encoded as the three bytes 11111001 10101110 00000110.
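The example above can be reproduced with a small loop-based encoder in the style of EncodeVarint64. AppendVarint64Sketch is a hypothetical name, and writing into a std::vector<uint8_t> is an assumption made here so the bytes are easy to inspect; LevelDB itself writes into a char buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: appends the varint encoding of v to out.
// Each byte carries 7 payload bits; bit 7 is set on every byte
// except the last one.
inline void AppendVarint64Sketch(std::vector<uint8_t>* out, uint64_t v) {
  assert(out != nullptr);
  const uint64_t kContinuation = 128;
  while (v >= kContinuation) {
    // Low 7 bits plus the continuation flag.
    out->push_back(static_cast<uint8_t>((v & 127) | kContinuation));
    v >>= 7;
  }
  // Final byte: high bit is 0 because v < 128 here.
  out->push_back(static_cast<uint8_t>(v));
}
```

For 104313 the three 7-bit groups, from least to most significant, are 1111001, 0101110 and 0000110. The first two get the continuation flag, giving the bytes 0xF9 0xAE 0x06, which matches the binary strings in the example.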

Next are the interface functions for the variable-length encoding:

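    // Varint-encode a 32-bit integer and append it to dst
    void PutVarint32(std::string *dst, uint32_t v);
    // Varint-encode a 64-bit integer and append it to dst
    void PutVarint64(std::string *dst, uint64_t v);
    // Return the length in bytes of the varint encoding of v
    int VarintLength(uint64_t v);
    // Decode a varint-encoded 32-bit integer from the bytes in [p, limit) into *value
    const char *GetVarint32PtrFallback(const char *p, const char *limit, uint32_t *value);
    // Decode a varint-encoded 32-bit integer from the front of input into *value
    // and advance input past the consumed bytes
    bool GetVarint32(Slice *input, uint32_t *value);
    // Decode a varint-encoded 64-bit integer from the bytes in [p, limit) into *value
    const char *GetVarint64Ptr(const char *p, const char *limit, uint64_t *value);
    // Decode a varint-encoded 64-bit integer from the front of input into *value
    // and advance input past the consumed bytes
    bool GetVarint64(Slice *input, uint64_t *value);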
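The decoding side is not shown in the article, so here is a sketch of how the pointer-based decoder and the length function could work. DecodeVarint32Sketch and VarintLengthSketch are hypothetical names; returning nullptr for truncated or over-long input is an assumption of this sketch about how errors are signaled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: decodes a varint32 from [p, limit).
// Returns a pointer just past the decoded bytes, or nullptr if the
// input ends before a byte with a clear high bit is seen, or if the
// encoding is longer than 5 bytes.
inline const char* DecodeVarint32Sketch(const char* p, const char* limit,
                                        uint32_t* value) {
  assert(value != nullptr);
  uint32_t result = 0;
  for (uint32_t shift = 0; shift <= 28 && p < limit; shift += 7) {
    uint32_t byte = static_cast<unsigned char>(*p);
    ++p;
    if (byte & 128) {
      // Continuation flag set: keep the 7 payload bits and go on.
      result |= (byte & 127) << shift;
    } else {
      // Last byte of this value.
      result |= byte << shift;
      *value = result;
      return p;
    }
  }
  return nullptr;
}

// Hypothetical sketch: number of bytes the varint encoding of v needs,
// one byte per started group of 7 bits.
inline int VarintLengthSketch(uint64_t v) {
  int len = 1;
  while (v >= 128) {
    v >>= 7;
    ++len;
  }
  return len;
}
```

Because the function returns the position after the value, a caller can decode several varints stored back to back by feeding the returned pointer into the next call.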


Original link: https://my.oschina.net/yunanlong/blog/3032907
