


 2021-12-21 21:43:07  


Abstract

Distributed Key-Value Database Based on Multiple Storage Engines


With the rapid development of the mobile Internet, databases must now store massive volumes of data, sustain highly concurrent reads and writes, and remain highly scalable and highly available. Traditional relational databases have become a bottleneck for server applications because of their low throughput, and a single-node database scales poorly, offers low availability, and is expensive to scale vertically, so NoSQL databases are attracting growing attention. This thesis develops a distributed key-value NoSQL database that uses the Raft algorithm to keep data consistent. It can plug in different storage engines to suit different read and write workloads, and it supports data sharding, replica load balancing, and dynamic capacity expansion. The whole application is implemented in Go and can be deployed automatically with Docker.

The main work of this thesis includes:

  1. Implementing the Raft algorithm module. Following the original Raft paper, the module implements leader election, heartbeats, log replication and log compaction, and nodes communicate with each other over RPC;
  2. Designing and implementing a pluggable storage engine module. This module encapsulates a basic interface behind which a variety of storage engines can be supported. Three commonly used engines are integrated: LevelDB, RocksDB and BoltDB;
  3. Designing and implementing the sharded storage module. In this module, the data set is split by a partitioning algorithm, and the partitions are distributed evenly across different replica groups for management;
  4. Designing and implementing the MetaServer module, which manages the cluster's replica distribution information and provides replica load balancing and dynamic capacity expansion;
  5. Designing and implementing the client API, with support for command-line interaction. The whole application is packaged as an image and deployed automatically with Docker. Finally, unit tests verify the system's functional modules, while integration tests and stress tests verify data consistency and confirm that the system handles high concurrency and remains highly available and scalable.
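Items 2 and 3 above can be illustrated with a minimal Go sketch. It is not the thesis's code: the `Engine` interface, `MemEngine`, `ShardOf` and `AssignShards` are hypothetical names, the in-memory map stands in for a real LevelDB, RocksDB or BoltDB wrapper, and hashing with FNV-1a plus round-robin assignment is an assumed partitioning scheme, since the abstract does not name the algorithm actually used.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// ErrNotFound is a hypothetical error returned by Get when the key is absent.
var ErrNotFound = errors.New("key not found")

// Engine is a hypothetical minimal interface that a pluggable storage
// engine wrapper could satisfy; the thesis's real interface is not shown.
type Engine interface {
	Get(key []byte) ([]byte, error)
	Put(key, value []byte) error
	Delete(key []byte) error
}

// MemEngine is an in-memory stand-in used only for illustration.
type MemEngine struct {
	data map[string][]byte
}

func NewMemEngine() *MemEngine {
	return &MemEngine{data: make(map[string][]byte)}
}

func (m *MemEngine) Get(key []byte) ([]byte, error) {
	v, ok := m.data[string(key)]
	if !ok {
		return nil, ErrNotFound
	}
	return v, nil
}

func (m *MemEngine) Put(key, value []byte) error {
	// Copy the value so later changes by the caller do not affect the store.
	m.data[string(key)] = append([]byte(nil), value...)
	return nil
}

func (m *MemEngine) Delete(key []byte) error {
	delete(m.data, string(key))
	return nil
}

// ShardOf maps a key to one of numShards partitions by hashing it with
// FNV-1a (an assumed scheme). numShards must be greater than zero.
func ShardOf(key []byte, numShards int) int {
	h := fnv.New32a()
	h.Write(key)
	return int(h.Sum32() % uint32(numShards))
}

// AssignShards spreads shards over replica groups round-robin, so the
// number of shards per group differs by at most one. groups must be non-empty.
func AssignShards(numShards int, groups []int) map[int]int {
	assign := make(map[int]int, numShards)
	for s := 0; s < numShards; s++ {
		assign[s] = groups[s%len(groups)]
	}
	return assign
}

func main() {
	const numShards = 8
	groups := []int{100, 101, 102} // hypothetical replica group IDs

	// One engine per replica group; any Engine implementation could be used.
	engines := make(map[int]Engine)
	for _, g := range groups {
		engines[g] = NewMemEngine()
	}
	assign := AssignShards(numShards, groups)

	// Route a write and a read through shard -> replica group -> engine.
	key := []byte("user:42")
	gid := assign[ShardOf(key, numShards)]
	if err := engines[gid].Put(key, []byte("alice")); err != nil {
		fmt.Println("put failed:", err)
		return
	}
	v, err := engines[gid].Get(key)
	fmt.Println(gid, string(v), err)
}
```

Because callers depend only on the `Engine` interface, switching the backing store would mean supplying a different implementation, with no change to the routing logic. In a replicated deployment, writes would presumably pass through the Raft log before reaching the engine; the sketch omits that step.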

Keywords: Distributed Database; Key-Value Storage; Raft; Optional Storage Engine; Data Partition

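Contents

Abstract

Chapter 1 Introduction

1.1 Research Background and Significance

1.2 Research Status of Distributed Data Storage at Home and Abroad

1.2.1 Research Status of Distributed Consensus Algorithms

1.2.2 Research Status of Distributed Storage Models

1.3 Research Content of This Thesis

Chapter 2 Theoretical Background and Key Technologies

2.1 Data Synchronization in Distributed Systems

2.1.1 RPC (Remote Procedure Call) [26] Communication

2.1.2 The Raft Consensus Algorithm [12]

2.2 Distributed Systems Theory

2.2.1 Data Partitioning

2.2.2 Load Balancing

2.3 Storage Engines

2.3.1 B/B+ Tree Model

2.3.2 LSM-Tree Model

2.3.3 Hash Model

Chapter 3 Requirements Analysis and System Design

3.1 System Requirements Analysis

3.1.1 Functional Requirements

3.1.2 Performance Requirements

3.2 Overall System Architecture Design

3.3 Data Synchronization Module Design

3.3.1 Leader Election and Heartbeat

3.3.2 Log Replication

3.3.3 Log Compaction

3.4 Storage Engine Module Design

3.5 MetaServer Module Design

3.5.1 Horizontal Scaling

3.5.2 Load Balancing Algorithm

3.6 Sharded Storage Module Design

3.6.1 Partitioning Algorithm

3.6.2 Data Reads and Writes

3.6.3 Replica Migration

3.7 Client Interaction Module Design

Chapter 4 System Implementation

4.1 Data Synchronization Module Implementation

4.1.1 Leader Election

4.1.2 Log Replication

4.1.3 Log Compaction

4.2 Storage Engine Module Implementation

4.2.1 Abstract Storage Engine Layer

4.2.2 Integrating Storage Engines

4.3 MetaServer Module Implementation

4.4 Sharded Storage Module Implementation

4.4.1 Data Reads and Writes

4.4.2 Replica Migration

4.5 Client Interaction Module Implementation

Chapter 5 System Deployment and Testing

5.1 System Environment

5.2 Automated Deployment with Docker

5.3 System Testing

5.3.1 Data Read/Write Functional Test

5.3.2 Data Consistency Test

5.3.3 Horizontal Scaling Test

Chapter 6 Summary and Outlook

6.1 Summary

6.2 Future Work

References

Acknowledgments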

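Chapter 1 Introduction


In the 1960s, network and hierarchical databases were prevalent, but developers who stored data in them had to know the structure of the data and the access paths to it, which made retrieving information complex and time-consuming. In 1970, Edgar F. Codd of IBM published a more convenient approach to managing data, proposing the relational model of databases for the first time [1]; over the following years he developed a series of mathematical theories for relational databases. Building on these theories, Ray Boyce and others used a simple keyword-based syntax to design SQL, a general-purpose language for relational databases, which lets developers manipulate relational databases conveniently. Larry Ellison recognized that the relational database proposed by Codd could comprehensively manage intricate data and had great commercial value, so he led a development team to build the Oracle database [2], which achieved great success.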




