[aprssig] distributed findu possible ?
Steve Dimse steve at dimse.com
Sat Aug 9 22:05:43 UTC 2008
- Previous message: [aprssig] distributed aprs visualization possible ?
- Next message: [aprssig] findU server move time yet again
On Aug 8, 2008, at 7:33 PM, Matti Aarnio wrote:

> If the "dump last 30 minutes of traffic" -feature could be thrown away,
> the APRS-IS server memory footprint would shrink considerably, but that
> does not help the "findu"-like systems at all, they would need N gigabytes
> of memory if it all should be kept on memory... (and "N" is very much
> larger than 4 ...)

N = 248 GB for eight years of weather data and satellite QSOs, 120 days of positions, and 60 days of messages, telemetry, errors, and a number of other tables.

> I have ran test extracts of full APRS-IS feed a few times.
> I recall it being 115 000 - 130 000 APRS data records per hour.
> Another view: 10-11 MB/hour of network traffic. (250 MB per 24h, etc.
> plus lookup indices.)
>
> Anyway: 30-40 APRS records per second all day around. That begins to
> make a serious challenge on database insert unless there is really smart
> indexing.

Keep in mind that one line of APRS data can, and usually does, have more than one kind of data. One Mic-E packet might have telemetry, position, status byte, and comment. Weather can have position, weather data, and comment, etc.

Some functions need specialty tables. For example, an incoming position is stored twice: once in the 120-day table, and once in a last-posit table that stores a single position for every call. This is essential for efficiently executing the near function.

Don't forget, as fast as data comes in, you have to delete the obsolete data. I run this at night, when the load on findU is lower. The nighttime load is only about 20 percent lower, though, as findU has a significant user base around the world. You cannot simply issue a single delete command, because the database is locked during the delete. Instead you need to delete a couple hundred rows at a time, wait for everything else to catch up, and loop. The routine watches the disk I/O wait time and skips a cycle if the system is backed up.
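
The batched delete described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not findU's actual routine: SQLite stands in for the real database, the `positions` table and its `received` column are hypothetical names, and the `is_backed_up` callback is a placeholder for the disk I/O wait check.

```python
import sqlite3
import time


def purge_old_rows(conn, cutoff, batch_size=200, pause=1.0,
                   is_backed_up=lambda: False):
    """Delete rows older than `cutoff` a few hundred at a time.

    `positions`, `received`, and `is_backed_up` are hypothetical names
    used only for this sketch. Returns the number of rows deleted.
    """
    deleted = 0
    while True:
        # Skip this cycle if the system is backed up (placeholder for
        # watching the disk I/O wait time).
        if is_backed_up():
            time.sleep(pause)
            continue
        # Delete one small batch so each lock is held only briefly.
        cur = conn.execute(
            "DELETE FROM positions WHERE rowid IN ("
            "SELECT rowid FROM positions WHERE received < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            break  # nothing obsolete is left
        deleted += cur.rowcount
        # Let inserts and reads catch up before the next batch.
        time.sleep(pause)
    return deleted
```

Committing after every batch is what releases the lock between batches; the pause and the batch size would have to be tuned against the real load.
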
It takes about 7 hours to delete the 24 hours of data (excepting that which is retained permanently).

So, for an average, I'd say take the APRS-IS packet rate and multiply by 8 for the database insert transaction rate.

You need an index to cover any search you want to make, as a full table search of a 40 GB table would take hours. With a non-compacted table (one where old data is deleted and new data can be added wherever the table has free space) the table must be locked from writes during a read. No new data can be written while a long read happens. This is why I do not allow direct access to the database.

So yes, the database insert load is significant. So is the read load: the near.cgi page showing 50 stations takes about 200 separate reads. The data for a weather graph is obtained in a single transaction, but to answer the query the database needs to seek to each data point.

You'd probably have to split the incoming and outgoing loads by a factor of 10 before they come in range of a typical PC.

Steve K4HG
