当前位置: 首页 > news >正文

南阳建设网站招聘怎样建立自己的网站平台

南阳建设网站招聘,怎样建立自己的网站平台,互联网网站建设公司,wordpress用腾讯云cdn前言 Apache Spark是专门为大规模数据处理而设计的快速通用的计算引擎,Spark拥有Hadoop MapReduce所具有的优点,但不同于Mapreduce的是Job中间输出结果可以保存在内存中,从而不再需要读写HDFS,因此Spark能更好的适用于数据挖掘与…

前言

Apache Spark是专门为大规模数据处理而设计的快速通用的计算引擎,Spark拥有Hadoop MapReduce所具有的优点,但不同于Mapreduce的是Job中间输出结果可以保存在内存中,从而不再需要读写HDFS,因此Spark能更好的适用于数据挖掘与机器学习等需要迭代的MapReduce。

Spark是一种与hadoop相似的开源集群计算环境,但是两者之间还存在一些不同之处,Spark启用了内存分布数据集群,除了能够提供交互式查询外,它还可以优化迭代工作负载。

Spark特点:
1、更快的速度:内存计算下,Spark比Hadoop快100倍
2、易用性:可以使用java、scala、python、R和SQL语言进行spark开发,Spark提供了80多个高级运算符
3、通用性:Spark提供了大量的库,包括Spark Core、Spark SQL、Spark Streaming、MLlib、GraphX。开发者可以在一个应用程序中无缝组合使用这些库
4、多种运行环境:Spark可在Hadoop、Apache Mesos、Kubernetes、standalone或其他云环境上运行

参考链接:https://blog.csdn.net/cuiyaonan2000/article/details/116048663

spark适用场景:
1、spark是基于内存的迭代计算,适合多次操作特定数据集的场合
2、数据量不是特别大,但要求实时统计分析需求
3、不适用异步细粒更新状态的应用,如web服务器存储、增量的web爬虫和索引

spark运行模式:
1、可以运行在一台机器上,称为Local(本地)运行模式
2、可以使用spark自带的资源调度系统,称为Standalone模式
3、可以使用Yarn、Mesos、kubernetes作为底层资源调度系统,称为Spark On Yarn、Spark On Mesos、Spark On K8s

参考链接:https://blog.csdn.net/jiayi_yao/article/details/125545826#t8

一、安装启动

安装spark及其依赖
yum install java-1.8.0-openjdk curl tar python3
mkdir -p /usr/local/spark
cd /usr/local/spark
wget https://mirrors.aliyun.com/apache/spark/spark-3.3.2/spark-3.3.2-bin-hadoop3.tgz
tar -xvf spark-3.3.2-bin-hadoop3.tgz

  • 启动spark-master
    cd /usr/local/spark/spark-3.3.2-bin-hadoop3/
    [root@bogon spark-3.3.2-bin-hadoop3]# ./sbin/start-master.sh
    可以看到类似如下的输出:
    starting org.apache.spark.deploy.master.Master, logging to /usr/local/spark/spark-3.3.2-bin-hadoop3/logs/spark-root-org.apache.spark.deploy.master.Master-1-bogon.out
    用tail命令查看执行日志
[root@bogon spark-3.3.2-bin-hadoop3]# tail logs/spark-root-org.apache.spark.deploy.master.Master-1-bogon.out 
23/03/06 14:35:44 INFO SecurityManager: Changing modify acls to: root
23/03/06 14:35:44 INFO SecurityManager: Changing view acls groups to: 
23/03/06 14:35:44 INFO SecurityManager: Changing modify acls groups to: 
23/03/06 14:35:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
23/03/06 14:35:45 INFO Utils: Successfully started service 'sparkMaster' on port 7077.
23/03/06 14:35:45 INFO Master: Starting Spark master at spark://bogon:7077
23/03/06 14:35:45 INFO Master: Running Spark version 3.3.2
23/03/06 14:35:45 INFO Utils: Successfully started service 'MasterUI' on port 8080.
23/03/06 14:35:45 INFO MasterWebUI: Bound MasterWebUI to 0.0.0.0, and started at http://bogon:8080
23/03/06 14:35:46 INFO Master: I have been elected leader! New state: ALIVE
  • 启动spark-worker
    cd /usr/local/spark/spark-3.3.2-bin-hadoop3/
    [root@bogon spark-3.3.2-bin-hadoop3]# ./sbin/start-worker.sh spark://bogon:7077
    可以看到类似如下的输出:
    starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/spark-3.3.2-bin-hadoop3/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-bogon.out
    用tail命令查看执行日志
[root@bogon spark-3.3.2-bin-hadoop3]# tail logs/spark-root-org.apache.spark.deploy.worker.Worker-1-bogon.out 
23/03/06 14:52:17 INFO Worker: Spark home: /usr/local/spark/spark-3.3.2-bin-hadoop3
23/03/06 14:52:17 INFO ResourceUtils: ==============================================================
23/03/06 14:52:17 INFO ResourceUtils: No custom resources configured for spark.worker.
23/03/06 14:52:17 INFO ResourceUtils: ==============================================================
23/03/06 14:52:17 WARN Utils: Service 'WorkerUI' could not bind on port 8081. Attempting port 8082.
23/03/06 14:52:17 INFO Utils: Successfully started service 'WorkerUI' on port 8082.
23/03/06 14:52:17 INFO WorkerWebUI: Bound WorkerWebUI to 0.0.0.0, and started at http://bogon:8082
23/03/06 14:52:17 INFO Worker: Connecting to master bogon:7077...
23/03/06 14:52:17 INFO TransportClientFactory: Successfully created connection to bogon/10.130.0.73:7077 after 85 ms (0 ms spent in bootstraps)
23/03/06 14:52:18 INFO Worker: Successfully registered with master spark://bogon:7077

二、测试

  • 通过http://$IP:8080 访问master页面,获取资源消耗等的摘要信息
    如果部署成功,我们可以看到类似如下的返回结果
    spark-master
  • 也可以通过web浏览器请求对应服务器8082端口(默认8081端口,可通过日志查看具体的端口号),查看worker的基本情况
    spark-worker
  • 提交测试任务,计算pi,提交命令如下
    ./bin/spark-submit --master spark://bogon:7077 examples/src/main/python/pi.py 1000
    可以看到类似如下的输出:
 23/03/06 16:32:55 INFO TaskSetManager: Starting task 999.0 in stage 0.0 (TID 999) (10.130.0.73, executor 0, partition 999, PROCESS_LOCAL, 4437 bytes) taskResourceAssignments Map()
23/03/06 16:32:55 INFO TaskSetManager: Finished task 995.0 in stage 0.0 (TID 995) in 217 ms on 10.130.0.73 (executor 0) (996/1000)
23/03/06 16:32:55 INFO TaskSetManager: Finished task 996.0 in stage 0.0 (TID 996) in 220 ms on 10.130.0.73 (executor 0) (997/1000)
23/03/06 16:32:55 INFO TaskSetManager: Finished task 997.0 in stage 0.0 (TID 997) in 198 ms on 10.130.0.73 (executor 0) (998/1000)
23/03/06 16:32:55 INFO TaskSetManager: Finished task 998.0 in stage 0.0 (TID 998) in 189 ms on 10.130.0.73 (executor 0) (999/1000)
23/03/06 16:32:55 INFO TaskSetManager: Finished task 999.0 in stage 0.0 (TID 999) in 238 ms on 10.130.0.73 (executor 0) (1000/1000)
23/03/06 16:32:55 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
23/03/06 16:32:55 INFO DAGScheduler: ResultStage 0 (reduce at /usr/local/spark/spark-3.3.2-bin-hadoop3/examples/src/main/python/pi.py:42) finished in 64.352 s
23/03/06 16:32:55 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
23/03/06 16:32:55 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
23/03/06 16:32:55 INFO DAGScheduler: Job 0 finished: reduce at /usr/local/spark/spark-3.3.2-bin-hadoop3/examples/src/main/python/pi.py:42, took 64.938378 s
Pi is roughly 3.133640
23/03/06 16:32:55 INFO SparkUI: Stopped Spark web UI at http://bogon:4040
23/03/06 16:32:55 INFO StandaloneSchedulerBackend: Shutting down all executors
23/03/06 16:32:55 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
23/03/06 16:32:55 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/03/06 16:32:55 INFO MemoryStore: MemoryStore cleared
23/03/06 16:32:55 INFO BlockManager: BlockManager stopped
23/03/06 16:32:55 INFO BlockManagerMaster: BlockManagerMaster stopped
23/03/06 16:32:55 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/03/06 16:32:55 INFO SparkContext: Successfully stopped SparkContext
23/03/06 16:32:56 INFO ShutdownHookManager: Shutdown hook called
23/03/06 16:32:56 INFO ShutdownHookManager: Deleting directory /tmp/spark-087c8a21-8641-4b20-8b65-be47b77f26c5
23/03/06 16:32:56 INFO ShutdownHookManager: Deleting directory /tmp/spark-24d1bc1a-841a-435e-8263-ad891e2aaa97/pyspark-f67ad2cb-ec86-41eb-ae7c-8fcb46e66827
23/03/06 16:32:56 INFO ShutdownHookManager: Deleting directory /tmp/spark-24d1bc1a-841a-435e-8263-ad891e2aaa97

可以在网页界面看到运行结果

也可以查看执行日志

[root@bogon spark-3.3.2-bin-hadoop3]# tail logs/spark-root-org.apache.spark.deploy.worker.Worker-1-bogon.out 
23/03/06 16:31:48 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
23/03/06 16:31:48 INFO ExecutorRunner: Launch command: "/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.312.b07-8.1.10.lns8.loongarch64/jre/bin/java" "-cp" "/usr/local/spark/spark-3.3.2-bin-hadoop3/conf/:/usr/local/spark/spark-3.3.2-bin-hadoop3/jars/*" "-Xmx1024M" "-Dspark.driver.port=46821" "-XX:+IgnoreUnrecognizedVMOptions" "--add-opens=java.base/java.lang=ALL-UNNAMED" "--add-opens=java.base/java.lang.invoke=ALL-UNNAMED" "--add-opens=java.base/java.lang.reflect=ALL-UNNAMED" "--add-opens=java.base/java.io=ALL-UNNAMED" "--add-opens=java.base/java.net=ALL-UNNAMED" "--add-opens=java.base/java.nio=ALL-UNNAMED" "--add-opens=java.base/java.util=ALL-UNNAMED" "--add-opens=java.base/java.util.concurrent=ALL-UNNAMED" "--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED" "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED" "--add-opens=java.base/sun.nio.cs=ALL-UNNAMED" "--add-opens=java.base/sun.security.action=ALL-UNNAMED" "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED" "--add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@bogon:46821" "--executor-id" "0" "--hostname" "10.130.0.73" "--cores" "4" "--app-id" "app-20230306163148-0000" "--worker-url" "spark://Worker@10.130.0.73:35787"
23/03/06 16:32:55 INFO Worker: Asked to kill executor app-20230306163148-0000/0
23/03/06 16:32:55 INFO ExecutorRunner: Runner thread for executor app-20230306163148-0000/0 interrupted
23/03/06 16:32:55 INFO ExecutorRunner: Killing process!
23/03/06 16:32:55 INFO Worker: Executor app-20230306163148-0000/0 finished with state KILLED exitStatus 143
23/03/06 16:32:55 INFO ExternalShuffleBlockResolver: Clean up non-shuffle and non-RDD files associated with the finished executor 0
23/03/06 16:32:55 INFO ExternalShuffleBlockResolver: Executor is not registered (appId=app-20230306163148-0000, execId=0)
23/03/06 16:32:55 INFO Worker: Cleaning up local directories for application app-20230306163148-0000
23/03/06 16:32:55 INFO ExternalShuffleBlockResolver: Application app-20230306163148-0000 removed, cleanupLocalDirs = true

清理环境

./sbin/stop-worker.sh
./sbin/stop-master.sh
rm -rf /usr/local/spark

http://www.dt0577.cn/news/18676.html

相关文章:

  • 宝安多屏网站建设公司好吗网站优化外包费用
  • 建零售网站还是海口网站关键词优化
  • 网站建设在哪里做比较好如何做广告宣传与推广
  • 做蛋糕有哪些网站seo赚钱培训
  • 衡量网站质量的标准淄博搜索引擎优化
  • dedecms做图库网站广东公司搜索seo哪家强
  • 深圳市创想三维科技有限公司山东搜索引擎优化
  • 建网站的要求社交网络推广方法
  • 网站加qq客服产品的推广及宣传思路
  • wordpress怎样显示子类目临沧seo
  • 做ic销售的各种网站seo排名点击
  • 做旅游网站会遇到什么问题百度推广登录平台客服
  • 一级域名网站搜狗官网
  • 论坛网站免费建设模板下载安装营销策划公司名称
  • 什么摄像头做直播网站代运营公司怎么找客户
  • 做独立网站需要注意什么手续网站的seo方案
  • 网站建设 推广 公司西安网站推广
  • 成都有没有做网站建设的最好的seo外包
  • 网站备案信息如何注销吗郑州最新通告
  • 河南省住房和城乡建设厅网站营销网站建设规划
  • 杭州网站模板国内网站排名
  • 山西省和城乡建设厅网站百度网盘破解版
  • 克隆网站首页做单页站几个文件广告传媒公司
  • 网站建设公众号开培训心得体会总结简短
  • 如何进行网站建设分析seo优化专家
  • 做的比较好的返利网站知乎太原做网站的工作室
  • 中国做健身补剂的网站百度竞价开户需要多少钱
  • 体验好的网站app注册拉新平台
  • 给蛋糕店做企业网站的文案杭州余杭区抖音seo质量高
  • 支付宝签约网站百度游戏官网