
Development of a Data Mining Utility for Social Networking Sites (Graduation Thesis)

 2021-05-06 14:26:41  

Abstract

With the growth of social networking sites and the arrival of the big data era, short-text social media such as Twitter have become increasingly popular, and their commercial value has become increasingly evident. Mining data from sites such as Twitter, and analyzing and processing tweet text effectively, makes it possible to discover users' interests and the areas they follow within massive volumes of tweets. This is of great value to businesses and merchants, and intelligently recommending content that users are interested in is also a convenience for the users themselves. A tweet is generally limited to about 140 characters, so each one carries relatively little information, but the total volume is large, and suitable tools can be used to analyze it statistically. This thesis uses the Twitter API to retrieve tweets from Twitter, collects user statuses and friend information, and presents the resulting statistics as charts. Throughout the analysis, basic user operations are simulated and user data is obtained in developer mode and then analyzed statistically, which gives the work some research value.

Keywords: social networking; data mining; Twitter
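The abstract describes extracting information from tweets and tabulating frequency statistics. The following is a minimal sketch of that idea in Python, not the thesis's own implementation. It assumes tweets are dictionaries shaped like Twitter API v1.1 tweet objects (an `entities` field holding a `hashtags` list); the sample records and the helper names `extract_hashtags` and `frequency_table` are hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical sample data, shaped like Twitter API v1.1 tweet objects.
# These are not real tweets; a real run would obtain them from the API.
sample_tweets = [
    {"text": "Learning #Python for #DataMining",
     "entities": {"hashtags": [{"text": "Python"}, {"text": "DataMining"}]}},
    {"text": "#python scripts for #Twitter",
     "entities": {"hashtags": [{"text": "python"}, {"text": "Twitter"}]}},
    {"text": "#DataMining with #Python again",
     "entities": {"hashtags": [{"text": "DataMining"}, {"text": "Python"}]}},
    {"text": "A tweet without hashtags",
     "entities": {"hashtags": []}},
]


def extract_hashtags(tweets):
    """Collect every hashtag, lowercased so 'Python' and 'python' match."""
    return [
        tag["text"].lower()
        for tweet in tweets
        for tag in tweet.get("entities", {}).get("hashtags", [])
    ]


def frequency_table(items, top_n=5):
    """Return the top_n most frequent items as (item, count) pairs."""
    return Counter(items).most_common(top_n)


if __name__ == "__main__":
    # Print a simple two-column frequency table.
    for tag, count in frequency_table(extract_hashtags(sample_tweets)):
        print(f"{tag:<12} {count}")
```

The same counting step would apply to other tweet entities, such as user mentions or words in the tweet text; only the extraction function would change.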

Contents

  1. Introduction

1.1 Data mining

1.1.1 Origins and technical background of data mining

1.1.2 Principles and algorithms of data mining

1.2 Data mining on social networking sites

  2. Key technologies

2.1 Python

2.1.1 Introduction to Python

2.1.2 Application areas of Python

2.1.3 Python libraries

2.2 Twitter API

2.2.1 What is the Twitter API

2.2.2 Connecting to the Twitter API from Python

2.3 MongoDB database

2.3.1 Introduction to MongoDB

2.3.2 Connecting to MongoDB from Python

2.4 JSON

2.4.1 JSON technology

2.4.2 JSON data structures

2.4.3 Advantages and disadvantages of JSON

  3. Design of the data mining software

3.1 Requirements

3.2 Overall architecture

3.3 Module design

3.3.1 Decomposition of the main module

3.3.2 Relationships between modules

3.3.3 Detailed composition of each module

3.4 Development and runtime environments

3.4.1 Development tools

3.4.2 Runtime environment

  4. Implementation of the functional modules

4.1 User login

4.2 Function selection

4.2.1 Exploring trending topics

4.2.2 Searching for tweets

4.2.3 Storing tweet data

4.2.4 Collecting time-series data

4.2.5 Extracting tweet entities

4.2.6 Finding the most popular tweets within a given scope

4.2.7 Tabulating frequency analysis

4.2.8 Retweet analysis

4.2.9 Collecting user profile information

4.2.10 Analyzing a user's friends and favorited tweets

4.3 Program results

4.3.1 Results of topic_trends (exploring trending topics)

4.3.2 Results of find_popular_tweets (finding the most popular tweets)

4.3.3 Frequency statistics displayed as charts

4.3.4 Results of analyze_favorite (analyzing a user's favorited tweets)

  5. Conclusion

5.1 Summary

5.2 Limitations and future work

References
