借助百度LBS开放平台,在地图上展示北京招聘运维工程师的公司地址[1]。
数据搜集
首先需要有地址信息,本文的地址来源是拉勾网。
#!/bin/bash URL="http://www.lagou.com/jobs/list_%E8%BF%90%E7%BB%B4%E5%B7%A5%E7%A8%8B%E5%B8%88?kd=%E8%BF%90%E7%BB%B4%E5%B7%A5%E7%A8%8B%E5%B8%88&spc=1&pl=&gj=&xl=&yx=&gx=&st=&labelWords=&lc=&workAddress=&city=%E5%8C%97%E4%BA%AC&requestId=&pn=" echo >links.data for page in $(seq 1 31) do echo "$URL$page" curl -s "$URL$page" |grep jobs |grep title \ | awk -F "\"" '{print $2}' |tee -a links.data done
接下来处理成“公司名称 地址”的形式。
#!/bin/bash [ $# -lt 1 ] && exit 1 file=$1 rm -f yunwei.txt while read id do date=`date +%s` curl -s $id -o $date.html name=`sed -n "/<h1 title/,/<\/h1>/p" $date.html |grep "<div" |awk -F '[>|<]' '{print $3}'` address=`grep "var address" $date.html |awk -F "\"" '{print $2}' |awk '{print $1}' |awk -F "(" '{print $1}'` echo "$name $address" |tee -a yunwei.txt rm -f $date.html done <$file
地址编码
借助百度Geocoding API,将地址转换为经纬度坐标[2]。
<?php require_once 'function_core.php'; #实现的是curlGet函数 require_once 'config/config.php'; $ak = $config['geocoding']['ak']; $config['output'] = "json"; $config['city'] = "北京"; if(isset($_POST['address'])) { $config['address'] = $_POST['address']; } elseif(isset($argv[1])) { $config['address'] = $argv[1]; $config['address'] = urlencode($config['address']); } else { $config['address'] = urlencode("百度大厦"); } if(isset($_POST['city'])) { $city = $_POST['city']; } elseif(isset($argv[2])) { $city = $argv[2]; $city = urlencode($config['city']); } else { $city = urlencode($config['city']); } $apiUrl= "http://api.map.baidu.com/geocoder/v2/?"; $url = $apiUrl . "ak=" . $ak . "&address=" . $config['address'] . "&output=" . $config['output'] . "&city=" . $city; $geoData = curlGet($url); if(isset($_POST['shell'])) { print_r(json_decode($geoData, true)); exit(); } die($geoData);
转换成LBS数据管理后台批量导入数据要求的csv格式。
<?php require_once 'function_core.php'; $config['mapApiUrl'] = "http://dev.annhe.net/corp_map/geocoding.php"; $data = file('data.txt'); $add = array(); foreach ( $data as $k => $v) { $arr = explode(" ", $v); $add[] = array('name' => trim($arr[0]), 'address' => trim($arr[1])); } $file = fopen('geotable.csv', 'w'); fwrite($file, "title,address,longitude,latitude,coord_type\n"); for($i=0; $i<count($add); $i++) { $title = $add[$i]['name']; $address = $add[$i]['address']; $address = str_replace(' ', '', $address); $address = str_replace('#', '', $address); $data = array('address' => $address); $geodata = curlPost($data, $config['mapApiUrl']); $geodata = json_decode($geodata, true); if(is_array($geodata) && array_key_exists('result', $geodata)) { $longitude = $geodata['result']['location']['lng']; $latitude = $geodata['result']['location']['lat']; } else { echo $address . " failed\n"; continue; } $coord_type = "3"; #注意这里一定是3,表示百度加密的经纬度坐标 fwrite($file, $title . "," . $address . "," . $longitude . "," . $latitude . "," . $coord_type . "\n"); } fclose($file); //print_r($add); ?>
地图展示
lbs后台数据管理,批量上传。获取图层编号,直接修改百度官网的示例,替换ak和图层编号。map_demo
可以看到:中关村、上地,国贸比较集中。
一些问题
经纬度误差
一开始csv中的coord_type设置为1,即 GPS经纬度坐标 ,得到的地图误差很大。原来经纬度坐标按照规定是需要加密的,使用百度geocoding API获取的经纬度坐标应该是百度加密的经纬度坐标[3][4],因此coord_type应设置为3,即 百度加密经纬度坐标。
可选,
1.GPS经纬度坐标
2.国测局加密经纬度坐标
3.百度加密经纬度坐标
4.百度加密墨卡托坐标
ak保密
客户端ak无需保密,服务端ak有些需要保密,例如LBS云存储的ak,最好是单独使用,设置IP白名单。
参考资料
[1]. http://developer.baidu.com/map/index.php?title=jspopular/guide/datalayer [2]. http://developer.baidu.com/map/index.php?title=webapi/guide/webservice-geocoding [3]. http://developer.baidu.com/map/index.php?title=open/question [4]. http://developer.baidu.com/map/index.php?title=lbscloud/api/geodata
发表回复