Long, slow process of capturing COMELEC data

Capturing COMELEC data from its official public server is a long and slow process with frequent pauses as the server gets congested, especially on the day after elections.

Each clustered precinct has about 150 kilobytes of data. Multiplied by about 76,000 clustered precincts, that comes to roughly 11 gigabytes to be stored, which is a lot of data to transfer across the Internet. Most of it is HTML formatting, which is needed for easy viewing by cluster or by town.
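The arithmetic behind that figure can be checked with a short Python sketch. Both inputs are the approximate numbers quoted above, and the function name is only illustrative; it assumes decimal units, where one gigabyte is 1,000,000 kilobytes.

```python
KB_PER_PRECINCT = 150   # approximate size of one clustered precinct's data
PRECINCTS = 76_000      # approximate number of clustered precincts


def total_gigabytes(kb_per_precinct: float, precincts: int) -> float:
    """Return the total size in decimal gigabytes (1 GB = 1,000,000 KB)."""
    return kb_per_precinct * precincts / 1_000_000


print(f"{total_gigabytes(KB_PER_PRECINCT, PRECINCTS):.1f} GB")  # prints "11.4 GB"
```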

It would have been great if COMELEC had offered a compressed archive of all the data without the formatting, which would probably have reduced the amount to be copied to just several megabytes. Without such an archive, Auza.Net implemented a creative solution that captures the data from COMELEC at low priority to avoid congesting COMELEC’s website. When the program is rerun, it captures only the new data.
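The article does not describe how Auza.Net's program works, so the following is only a rough Python sketch of the general idea: pause between requests and skip anything already saved. Every name here (capture_new, fetch_url, the page names and URLs) is hypothetical, and the two-second delay is an arbitrary assumption rather than a known setting.

```python
import time
import urllib.request
from pathlib import Path
from typing import Callable, Iterable


def fetch_url(url: str) -> bytes:
    """Download one page; a real run would add retries and error handling."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def capture_new(
    pages: Iterable[tuple[str, str]],
    out_dir: Path,
    delay_seconds: float = 2.0,  # assumed pause, not a known setting
    fetch: Callable[[str], bytes] = fetch_url,
) -> int:
    """Save pages not yet on disk, pausing between requests.

    `pages` yields (file name, URL) pairs. Returns how many pages were
    downloaded during this run.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    downloaded = 0
    for name, url in pages:
        target = out_dir / name
        if target.exists():
            continue  # captured on an earlier run, so skip it
        if downloaded:
            time.sleep(delay_seconds)  # keep the load on the server low
        data = fetch(url)
        # Write to a temporary file first so that an interrupted run
        # never leaves a partial file that looks like a finished capture.
        partial = target.with_name(target.name + ".part")
        partial.write_bytes(data)
        partial.replace(target)
        downloaded += 1
    return downloaded
```

Calling capture_new a second time with the same list downloads nothing, which is the "only the new data" behavior on a rerun. Passing a different `fetch` function makes it possible to try the logic without touching any real server.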

It is estimated that it will take about one more week to completely capture all the data at the precinct level. Municipal-level data, though, takes only about half a day.

Compared to previous elections, when there was no other way to get the data, this is already a data analyst’s nirvana.

Source: BoholNewsDaily
