
Optimizing the Import

Leaving all of this data in the flat files won’t be very efficient for creating a map from the data, since it will take minutes each time to parse the files and will likely flood all the memory buffers on your server and your visitors’ machines. Therefore, you’ll import the data points into a SQL data structure so that you can selectively plot the information based on your visitors’ interests (as described in the next two chapters).

Caution

We assume you are already familiar with MySQL and have an administration tool for your database that you are skilled at using. If you’re not familiar with MySQL, we recommend Beginning PHP and MySQL 5: From Novice to Professional, Second Edition, by W. Jason Gilmore (http://www.apress.com/book/bookDisplay.html?bID=10017).

You’ll be storing the information from each of your data files in its own table. While the data you are interested in has a 1:1:1 relationship among the three files, the reason for doing this is threefold:

• Reading in the contents of each file into a gigantic array and then inserting the data into a single unified table one record at a time would consume hundreds of megabytes of memory. Since the default PHP per-script memory limit is 8MB, and most web hosts don’t increase this limit, this isn’t a workable solution in general. We also assume you do not have sufficient permissions at your web host to increase your own memory limits. If you do control your own server, feel free to use this method if you prefer, as there are no real drawbacks other than the one-time memory consumption issue.

• Opening the three files simultaneously and sequentially reassembling the corresponding records would require that the files be sorted first. (The FCC explicitly states that it will never sort the files before you download them.) Doing this in PHP would again exceed the memory limits, and using the Unix sort filesystem utility requires the use of PHP’s exec(), which is also a protected function on many web hosts.

• Using a SQL INSERT statement for the data in the RA.dat file, then using an UPDATE statement to fill in the blanks when you later read in EN.dat and CO.dat, would require heavy use of the MySQL UPDATE feature, which is an order of magnitude (ten times) slower than using INSERT. We tried this method, and it took more than eight hours to import all of the data. Listing 5-3 only takes a few minutes.
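The memory argument in the first bullet can be illustrated with a short sketch. Reading a file one line at a time keeps only the current record in memory, whereas loading the file into an array holds every record at once. The helper name read_pipe_rows() and the sample data are hypothetical, and the generator syntax assumes PHP 7.0 or later.

```php
<?php
// Hypothetical helper: yield one parsed record at a time from a
// pipe-delimited file instead of loading the whole file into an array.
function read_pipe_rows(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            $line = rtrim($line, "\r\n");
            if ($line === '') {
                continue; // skip blank lines
            }
            yield explode('|', $line);
        }
    } finally {
        fclose($handle);
    }
}

// Usage with a small made-up sample file (the field layout is illustrative).
$sample = tempnam(sys_get_temp_dir(), 'asr');
file_put_contents($sample, "RA|REG|A1|1|1001\nRA|REG|A2|1|1002\n");

$ids = [];
foreach (read_pipe_rows($sample) as $row) {
    $ids[] = $row[4]; // only the current row is held in memory
}
unlink($sample);
```

Because each row is discarded after the loop body runs, peak memory stays roughly constant regardless of how large the input file is.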

C H A P T E R 5■ M A N I P U L AT I N G T H I R D - PA RT Y D ATA 102

The structure we’ve chosen for the three-table design is in Listing 5-2. Copy these statements into your administration tool and execute them.

Listing 5-2. The MySQL Table Creation Statements for the Example

CREATE TABLE fcc_location (
    loc_id int(10) unsigned NOT NULL auto_increment,
    unique_si_loc bigint(20) NOT NULL default '0',
    lat_deg int(11) default '0',
    lat_min int(11) default '0',
    lat_sec float default '0',
    lat_dir char(1) default NULL,
    latitude double default '0',
    long_deg int(11) default '0',
    long_min int(11) default '0',
    long_sec float default '0',
    long_dir char(1) default NULL,
    longitude double default '0',
    PRIMARY KEY (loc_id),
    KEY unique_si (unique_si_loc)
) ENGINE=MyISAM;

CREATE TABLE fcc_owner (
    owner_id int(10) unsigned NOT NULL auto_increment,
    unique_si_own bigint(20) NOT NULL default '0',
    owner_name varchar(200) default NULL,
    owner_address varchar(35) default NULL,
    owner_city varchar(20) default NULL,
    owner_state char(2) default NULL,
    owner_zip varchar(10) default NULL,
    PRIMARY KEY (owner_id),
    KEY unique_si (unique_si_own)
) ENGINE=MyISAM;

CREATE TABLE fcc_structure (
    struc_id int(10) unsigned NOT NULL auto_increment,
    unique_si bigint(20) NOT NULL default '0',
    date_constr date default '0000-00-00',
    date_removed date default '0000-00-00',
    struc_address varchar(80) default NULL,
    struc_city varchar(20) default NULL,
    struc_state char(2) default NULL,
    struc_height double default '0',
    struc_elevation double NOT NULL default '0',
    struc_ohag double NOT NULL default '0',
    struc_ohamsl double default '0',
    struc_type varchar(6) default NULL,
    PRIMARY KEY (struc_id),
    KEY unique_si (unique_si),
    KEY struc_state (struc_state)
) ENGINE=MyISAM;
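The date_constr and date_removed columns hold values in MySQL's YYYY-MM-DD form, so dates from the flat files need converting before they are inserted. The following is a minimal sketch of one way to do that while treating empty fields explicitly. The helper name to_mysql_date() is hypothetical, the MM/DD/YYYY input format is an assumption about the source files, and returning NULL for an empty field is a design choice made here for illustration. The nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: convert a date string from the flat file into
// MySQL's YYYY-MM-DD form, or return null when the field is empty or
// cannot be parsed by strtotime().
function to_mysql_date(string $raw): ?string
{
    $raw = trim($raw);
    if ($raw === '') {
        return null; // nothing recorded for this field
    }
    $timestamp = strtotime($raw);
    if ($timestamp === false) {
        return null; // unparseable input
    }
    return date('Y-m-d', $timestamp);
}

// Assumed input format: MM/DD/YYYY (strtotime() reads slashes as month/day).
$constructed = to_mysql_date('07/04/1996');
$removed     = to_mysql_date('');
```

Checking for an empty string first avoids passing a failed strtotime() result to date(), which would otherwise produce a date at or near the Unix epoch rather than an empty value.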

After you create the tables, run Listing 5-3 from either a browser or the command line to import the data. Importing the data could take up to ten minutes, so be patient.

Listing 5-3. FCC ASR Conversion to SQL Data Structures

<?php
set_time_limit(0); // this could take a while

// Connect to the database
require($_SERVER['DOCUMENT_ROOT'] . '/db_credentials.php');
$conn = mysql_connect("localhost", $db_name, $db_pass);
mysql_select_db("googlemapsbook", $conn);

// Open the structure registration file (RA.dat)
$handle = @fopen("RA.dat","r");
if ($handle) {
    // Read one line at a time; fgets() returns false at end of file
    while (($buffer = fgets($handle, 4096)) !== false) {
        $row = explode("|",$buffer);
        if (isset($row[3]) && $row[3] > 0) {
            // Modify things before we insert them
            $row[12] = date("Y-m-d",strtotime($row[12]));
            $row[13] = date("Y-m-d",strtotime($row[13]));
            $row[23] = addslashes($row[23]);
            $row[24] = addslashes($row[24]);
            $row[30] = addslashes($row[30]);

            // Formulate our query
            $query = "INSERT INTO fcc_structure (unique_si, date_constr,
                date_removed, struc_address, struc_city, struc_state,
                struc_height, struc_elevation, struc_ohag, struc_ohamsl,
                struc_type)
                VALUES ({$row[4]}, '{$row[12]}', '{$row[13]}', '{$row[23]}',
                '{$row[24]}', '{$row[25]}', '{$row[26]}', '{$row[27]}',
                '{$row[28]}', '{$row[29]}', '{$row[30]}')";

            // Execute our query
            $result = @mysql_query($query);
            if (!$result) echo("ERROR: Duplicate structure info #{$row[4]} <br>\n");
        }
    }
    fclose($handle);
}
echo "Done Structures. <br>\n";

// Open the Ownership Data file
$handle = @fopen("EN.dat","r");
if ($handle) {
    // Read one line at a time; fgets() returns false at end of file
    while (($buffer = fgets($handle, 4096)) !== false) {
        $row = explode("|",$buffer);
        if (isset($row[3]) && $row[3] > 0) {
            // Escape the free-text fields before building the query
            $row[7] = addslashes($row[7]);
            $row[14] = addslashes($row[14]);
            $row[16] = addslashes($row[16]);

            $query = "INSERT INTO fcc_owner (unique_si_own, owner_name,
                owner_address, owner_city, owner_state, owner_zip)
                VALUES ({$row[4]}, '{$row[7]}', '{$row[14]}', '{$row[16]}',
                '{$row[17]}', '{$row[18]}')";
            $result = @mysql_query($query);

            if (!$result) {
                // Newer information later in the file: UPDATE instead
                $query = "UPDATE fcc_owner SET owner_name='{$row[7]}',
                    owner_address='{$row[14]}', owner_city='{$row[16]}',
                    owner_state='{$row[17]}', owner_zip='{$row[18]}'
                    WHERE unique_si_own={$row[4]}";
                $result = @mysql_query($query);
                if (!$result)
                    echo "Failure to import ownership for struc. #{$row[4]}<br>\n";
                else
                    echo "Updated ownership for struc. #{$row[4]} <br>\n";
            }
        }
    }
    fclose($handle);
}
echo "Done Ownership. <br>\n";

// Open the Physical Locations file
$handle = @fopen("CO.dat","r");
if ($handle) {
    // Read one line at a time; fgets() returns false at end of file
    while (($buffer = fgets($handle, 4096)) !== false) {
        $row = explode("|",$buffer);
        if (isset($row[3]) && $row[3] > 0) {
            // Convert degrees/minutes/seconds to signed decimal degrees
            if ($row[9] == "S") $sign = -1; else $sign = 1;
            $dec_lat = $sign*($row[6]+$row[7]/60+$row[8]/3600);
            if ($row[14] == "W") $sign = -1; else $sign = 1;
            $dec_long = $sign*($row[11]+$row[12]/60+$row[13]/3600);

            $query = "INSERT INTO fcc_location (unique_si_loc, lat_deg,
                lat_min, lat_sec, lat_dir, latitude, long_deg, long_min,
                long_sec, long_dir, longitude)
                VALUES ({$row[4]}, '{$row[6]}', '{$row[7]}', '{$row[8]}',
                '{$row[9]}', '$dec_lat', '{$row[11]}', '{$row[12]}',
                '{$row[13]}', '{$row[14]}', '$dec_long')";
            $result = @mysql_query($query);

            if (!$result) {
                // Newer information later in the file: UPDATE instead
                $query = "UPDATE fcc_location SET lat_deg='{$row[6]}',
                    lat_min='{$row[7]}', lat_sec='{$row[8]}',
                    lat_dir='{$row[9]}', latitude='$dec_lat',
                    long_deg='{$row[11]}', long_min='{$row[12]}',
                    long_sec='{$row[13]}', long_dir='{$row[14]}',
                    longitude='$dec_long'
                    WHERE unique_si_loc='{$row[4]}'";
                $result = @mysql_query($query);
                if (!$result)
                    echo "Failure to import location for struc. #{$row[4]} <br>\n";
                else
                    echo "Updated location for struc. #{$row[4]} <br>\n";
            }
        }
    }
    fclose($handle);
}
echo "Done Locations. <br>\n";
?>
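The coordinate arithmetic in the last part of Listing 5-3 converts degrees, minutes, and seconds into signed decimal degrees: minutes are divided by 60, seconds by 3600, and the total is negated for southern latitudes and western longitudes. As a rough sketch, the same calculation can be isolated in a small function. The name dms_to_decimal() and the sample coordinates are hypothetical and are not part of the listing; the type declarations assume PHP 7.0 or later.

```php
<?php
// Hypothetical helper mirroring the conversion used in Listing 5-3.
// $dir is the hemisphere letter: N or S for latitude, E or W for longitude.
function dms_to_decimal(float $deg, float $min, float $sec, string $dir): float
{
    // South and west are negative in decimal-degree notation
    $sign = ($dir === 'S' || $dir === 'W') ? -1 : 1;
    return $sign * ($deg + $min / 60 + $sec / 3600);
}

// Made-up sample values for illustration only.
$lat  = dms_to_decimal(40, 30, 0, 'N');  // 40 degrees 30 minutes north
$long = dms_to_decimal(74, 0, 36, 'W');  // 74 degrees 0 minutes 36 seconds west
```

Here 40°30′0″ N works out to 40.5, and 74°0′36″ W to approximately -74.01, since 36/3600 = 0.01.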