LevelDB Source Code Analysis: Get

2019-04-05

Get

LevelDB provides the Get interface for looking up the value of a given key:

Status DBImpl::Get(const ReadOptions &options,
                   const Slice &key,
                   std::string *value)

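A Get can be performed against a specific snapshot. If a snapshot is specified, Get uses that snapshot's sequence number; otherwise it uses the latest sequence number. It then takes a reference on the memtable, the immutable memtable (if any), and the current Version, so that they stay alive while the mutex is released: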

    Status s;
    MutexLock l(&mutex_);
    SequenceNumber snapshot;
    if (options.snapshot != nullptr)
    {
        snapshot =
            static_cast<const SnapshotImpl *>(options.snapshot)->sequence_number();
    }
    else
    {
        snapshot = versions_->LastSequence();
    }

    MemTable *mem = mem_;
    MemTable *imm = imm_;
    Version *current = versions_->current();
    mem->Ref();
    if (imm != nullptr)
        imm->Ref();
    current->Ref();
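
To make the role of the sequence number concrete, here is a minimal, self-contained sketch. It is not LevelDB code: `ToyStore`, `ToyEntry`, and `GetAt` are hypothetical names invented for illustration. The only idea taken from the text above is that a read sees the newest entry whose sequence number does not exceed the sequence number chosen for the read.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy model, not LevelDB's real data structures.
struct ToyEntry {
    std::string key;
    uint64_t sequence;  // monotonically increasing write counter
    std::string value;
};

class ToyStore {
public:
    // Every write is stamped with the next sequence number.
    void Put(const std::string &key, const std::string &value) {
        entries_.push_back(ToyEntry{key, ++last_sequence_, value});
    }

    uint64_t LastSequence() const { return last_sequence_; }

    // Store in *value the newest value of `key` whose sequence number
    // is <= snapshot. Returns false if no such entry exists.
    bool GetAt(const std::string &key, uint64_t snapshot,
               std::string *value) const {
        assert(value != nullptr);
        // Entries are appended in sequence order, so scan from the back.
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it) {
            if (it->key == key && it->sequence <= snapshot) {
                *value = it->value;
                return true;
            }
        }
        return false;
    }

private:
    std::vector<ToyEntry> entries_;
    uint64_t last_sequence_ = 0;
};
```

In this toy model, remembering `LastSequence()` before a second `Put` plays the role of taking a snapshot: a later `GetAt` with the remembered number still returns the older value, while `GetAt` with the current `LastSequence()` returns the newer one.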

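The lookup checks the memtable first and stops if the key is found there. Otherwise it checks the immutable memtable (if one exists) and again stops on a hit. Both lookups use the Get function of the MemTable class (analyzed in the article "LevelDB Source Code Analysis: MemTable"). Only if both miss does it search the sstables, through the Get function of the Version class: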

    // Unlock while reading from files and memtables
    {
        mutex_.Unlock();
        // First look in the memtable, then in the immutable memtable (if any).
        LookupKey lkey(key, snapshot);
        if (mem->Get(lkey, value, &s))
        {
            // Done
        }
        else if (imm != nullptr && imm->Get(lkey, value, &s))
        {
            // Done
        }
        else
        {
            s = current->Get(options, lkey, value, &stats);
            have_stat_update = true;
        }
        mutex_.Lock();
    }
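
The search order above can be sketched with a toy model. `Tier`, `Lookup`, and `TieredGet` are hypothetical names, and each tier is reduced to an in-memory map plus a set of deleted keys; this illustrates the "newest tier that knows the key decides" rule under those assumptions and is not how LevelDB stores data.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical result of asking one tier about a key.
enum class Lookup { kFound, kDeleted, kNotPresent };

// Hypothetical stand-in for a memtable or the set of sstables.
struct Tier {
    std::map<std::string, std::string> values;
    std::set<std::string> tombstones;  // keys deleted in this tier

    Lookup Get(const std::string &key, std::string *value) const {
        if (tombstones.count(key) > 0) return Lookup::kDeleted;
        auto it = values.find(key);
        if (it == values.end()) return Lookup::kNotPresent;
        *value = it->second;
        return Lookup::kFound;
    }
};

// Search from the newest tier to the oldest. `imm` may be null,
// mirroring the optional immutable memtable.
bool TieredGet(const Tier &mem, const Tier *imm, const Tier &tables,
               const std::string &key, std::string *value) {
    assert(value != nullptr);
    const Tier *order[] = {&mem, imm, &tables};
    for (const Tier *tier : order) {
        if (tier == nullptr) continue;
        Lookup r = tier->Get(key, value);
        if (r == Lookup::kFound) return true;
        // A deletion in a newer tier hides values in older tiers.
        if (r == Lookup::kDeleted) return false;
    }
    return false;
}
```

The point of the sketch is that a deletion marker counts as a hit: once a newer tier reports the key as deleted, older tiers are not consulted.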

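If the sstables were searched, the seek statistics of the files touched by the lookup are updated, which may satisfy a compaction trigger. MaybeScheduleCompaction is therefore called to schedule a compaction if one is needed (analyzed at https://www.cnblogs.com/YuNanlong/p/9440548.html):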

    if (have_stat_update && current->UpdateStats(stats))
    {
        MaybeScheduleCompaction();
    }
    mem->Unref();
    if (imm != nullptr)
        imm->Unref();
    current->Unref();
    return s;
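
As a rough illustration of how a seek count can turn into a compaction trigger, here is a simplified sketch. `ToyFile` and `ToyVersion` are hypothetical types, and the rule shown (each file has a seek budget, and exhausting it nominates the file for compaction) is an assumption made for the example rather than a copy of Version::UpdateStats.

```cpp
#include <cassert>

// Hypothetical, simplified counterpart of a file's metadata.
struct ToyFile {
    int allowed_seeks;  // remaining seek budget
};

// Hypothetical, simplified counterpart of a Version.
struct ToyVersion {
    ToyFile *file_to_compact = nullptr;

    // Charge one seek to `f`. Returns true when the budget runs out and
    // no other file is already nominated, meaning that the caller should
    // consider scheduling a compaction.
    bool UpdateStats(ToyFile *f) {
        if (f == nullptr) return false;
        f->allowed_seeks--;
        if (f->allowed_seeks <= 0 && file_to_compact == nullptr) {
            file_to_compact = f;
            return true;
        }
        return false;
    }
};
```

With a budget of 2, the first call returns false, the second returns true and records the file, and later calls return false because a file is already nominated.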

Next, the Get function of the Version class:

Status Version::Get(const ReadOptions &options,
                    const LookupKey &k,
                    std::string *value,
                    GetStats *stats)

It starts by initializing the variables it needs:

    Slice ikey = k.internal_key();
    Slice user_key = k.user_key();
    const Comparator *ucmp = vset_->icmp_.user_comparator();
    Status s;

    stats->seek_file = nullptr;
    stats->seek_file_level = -1;
    FileMetaData *last_file_read = nullptr;
    int last_file_read_level = -1;

    // We can search level-by-level since entries never hop across
    // levels.  Therefore we are guaranteed that if we find data
    // in a smaller level, later levels are irrelevant.
    std::vector<FileMetaData *> tmp;
    FileMetaData *tmp2;

Then it searches level by level:

    for (int level = 0; level < config::kNumLevels; level++)
    {

If the level has no files, skip it:

        size_t num_files = files_[level].size();
        if (num_files == 0)
            continue;

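At level 0, files may overlap each other, so every file whose key range could contain the key is collected into tmp, sorted from newest to oldest, and used as the list of files to search: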

        // Get the list of files to search in this level
        FileMetaData *const *files = &files_[level][0];
        if (level == 0)
        {
            // Level-0 files may overlap each other.  Find all files that
            // overlap user_key and process them in order from newest to oldest.
            tmp.reserve(num_files);
            for (uint32_t i = 0; i < num_files; i++)
            {
                FileMetaData *f = files[i];
                if (ucmp->Compare(user_key, f->smallest.user_key()) >= 0 &&
                    ucmp->Compare(user_key, f->largest.user_key()) <= 0)
                {
                    tmp.push_back(f);
                }
            }
            if (tmp.empty())
                continue;

            std::sort(tmp.begin(), tmp.end(), NewestFirst);
            files = &tmp[0];
            num_files = tmp.size();
        }
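
The level-0 selection can be sketched in isolation. `ToyFileMeta` and `Level0Candidates` are hypothetical names; user keys are plain strings compared bytewise, and a larger file number is assumed to mean a newer file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified file metadata.
struct ToyFileMeta {
    uint64_t number;       // assumed: larger number == newer file
    std::string smallest;  // smallest user key in the file
    std::string largest;   // largest user key in the file
};

// Collect every file whose [smallest, largest] range covers user_key,
// ordered from newest to oldest.
std::vector<const ToyFileMeta *> Level0Candidates(
    const std::vector<ToyFileMeta> &files, const std::string &user_key) {
    std::vector<const ToyFileMeta *> result;
    result.reserve(files.size());
    for (const ToyFileMeta &f : files) {
        assert(f.smallest <= f.largest);
        if (user_key >= f.smallest && user_key <= f.largest) {
            result.push_back(&f);
        }
    }
    std::sort(result.begin(), result.end(),
              [](const ToyFileMeta *a, const ToyFileMeta *b) {
                  return a->number > b->number;  // newest first
              });
    return result;
}
```

Sorting newest first matters because overlapping level-0 files can hold different versions of the same key, and the search should meet the most recent one first.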

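At any other level, files do not overlap, so FindFile performs a binary search for the first file whose largest key is >= ikey. If that file's smallest key is not greater than user_key, the file becomes the only candidate; otherwise the level has no candidate: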

        else
        {
            // Binary search to find earliest index whose largest key >= ikey.
            uint32_t index = FindFile(vset_->icmp_, files_[level], ikey);
            if (index >= num_files)
            {
                files = nullptr;
                num_files = 0;
            }
            else
            {
                tmp2 = files[index];
                if (ucmp->Compare(user_key, tmp2->smallest.user_key()) < 0)
                {
                    // All of "tmp2" is past any data for user_key
                    files = nullptr;
                    num_files = 0;
                }
                else
                {
                    files = &tmp2;
                    num_files = 1;
                }
            }
        }
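
The binary search for sorted, non-overlapping files can be sketched as follows. `ToyRange`, `FindFileIndex`, and `CandidateFile` are hypothetical names, and keys are plain strings rather than internal keys; the sketch only illustrates "first file whose largest key is >= the target, then check the lower bound".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical key range of one file.
struct ToyRange {
    std::string smallest;
    std::string largest;
};

// Return the index of the first file whose largest key is >= key, or
// files.size() if there is none. Files must be sorted and disjoint.
std::size_t FindFileIndex(const std::vector<ToyRange> &files,
                          const std::string &key) {
    std::size_t left = 0;
    std::size_t right = files.size();
    while (left < right) {
        std::size_t mid = left + (right - left) / 2;
        if (files[mid].largest < key) {
            left = mid + 1;  // files at or before mid end before key
        } else {
            right = mid;     // mid may be the answer; keep it in range
        }
    }
    return right;
}

// Return the single file that may contain key, or nullptr if key lies
// past the last file or in a gap between two files.
const ToyRange *CandidateFile(const std::vector<ToyRange> &files,
                              const std::string &key) {
    std::size_t index = FindFileIndex(files, key);
    if (index >= files.size()) return nullptr;
    assert(files[index].smallest <= files[index].largest);
    if (key < files[index].smallest) return nullptr;
    return &files[index];
}
```

With files covering [a, c], [f, h], and [m, p], the key "g" selects the second file, while "d" falls in a gap and "z" lies past the last file, so both yield no candidate.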

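Iterate over the candidate files. If this read has to seek more than one file, the first file that was sought is recorded in stats, so that its remaining seek count can later be decremented by one (by the UpdateStats function):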

        for (uint32_t i = 0; i < num_files; ++i)
        {
            if (last_file_read != nullptr && stats->seek_file == nullptr)
            {
                // We have had more than one seek for this read.  Charge the 1st file.
                stats->seek_file = last_file_read;
                stats->seek_file_level = last_file_read_level;
            }

            FileMetaData *f = files[i];
            last_file_read = f;
            last_file_read_level = level;

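Call table_cache_->Get to search the file for the key. If the key is not found, continue with the next file. If it is found, return immediately, whether the entry is a live value, a deletion marker, or a corrupted entry, because any entry for the key found later would be older and is shadowed by this one: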

            Saver saver;
            saver.state = kNotFound;
            saver.ucmp = ucmp;
            saver.user_key = user_key;
            saver.value = value;
            s = vset_->table_cache_->Get(options, f->number, f->file_size,
                                         ikey, &saver, SaveValue);
            if (!s.ok())
            {
                return s;
            }
            switch (saver.state)
            {
            case kNotFound:
                break; // Keep searching in other files
            case kFound:
                return s;
            case kDeleted:
                s = Status::NotFound(Slice()); // Use empty error message for speed
                return s;
            case kCorrupt:
                s = Status::Corruption("corrupted key for ", user_key);
                return s;
            }
        }
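
The seek-charging rule in the loop above is easy to misread, so here it is reduced to a standalone sketch. `ChargedFile` is a hypothetical helper: its input says, for each file probed in order, whether that file contained an entry for the key, and its result is the index of the file that would be charged a seek, or -1 if none.

```cpp
#include <cassert>
#include <vector>

// Simulate the probing loop over candidate files (possibly spanning
// several levels). Returns the index of the file to charge, or -1.
int ChargedFile(const std::vector<bool> &file_has_key) {
    int seek_file = -1;       // plays the role of stats->seek_file
    int last_file_read = -1;  // plays the role of last_file_read
    for (int i = 0; i < static_cast<int>(file_has_key.size()); ++i) {
        if (last_file_read != -1 && seek_file == -1) {
            // More than one file is being read: charge the first one.
            seek_file = last_file_read;
        }
        last_file_read = i;
        if (file_has_key[i]) break;  // found or deleted: stop probing
    }
    return seek_file;
}
```

A lookup satisfied by the first file it reads charges nothing. As soon as a second file has to be read, the first file is charged, because reading it was a wasted seek.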

Original article: https://my.oschina.net/yunanlong/blog/3032918
