[aprssig] distributed findu possible ?
Steve Dimse steve at dimse.com
Sun Aug 10 15:34:57 UTC 2008
- Previous message: [aprssig] distributed findu possible ?
- Next message: [aprssig] distributed findu possible ?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Aug 10, 2008, at 9:02 AM, Matti Aarnio wrote:

> One of the reasons that people have no idea of what findU can do, is its
> "user interface". Indeed you have supplied only backend of things, no
> frontend at all,

Frontend and backend have specific meanings in dynamic web systems: typically the backend is the database and the frontend is the web server. In larger systems these are often on different physical machines. Under that standard definition there is indeed a front end on findU. And I specifically disallow anyone from using findU as a backend.

I take your meaning to be that findU does not have a user-friendly way to generate the URLs. That is very intentional. findU is a worldwide system, and I do not have the resources to localize a user interface in different languages. On the other hand, it is relatively simple to create forms that generate the URLs. It was and is my hope to get more people involved in creating APRS internet resources by allowing them to create their own form pages to generate the findU URLs. A handful of people have, in a few languages, and I link the ones I know of on my front page. I'd still like to see more.

> and on some details like how long data is retained the
> information is not given anywhere that I can spot

That's because I need to vary it from time to time, as disk space wanes and the APRS IS traffic rises. I can't even keep up the info already on there; I just noticed the front page says the database is 58 GB in size, which is really old info ;-)

> It is much much easier to point aprs.fi's map for the general area of interest,
> and then look at what happens around there.

Of course it is. I wish I had time to program a full gmap implementation on findU. My point though is you cannot call something a distributed findU if it only has the easy features of findU. The database aspect of the aprs.fi front page is trivial.
In fact, I was doing it in memory, without a database backend, 12 years ago as part of APRServ, the original APRS hub program. I'm not saying aprs.fi is not useful, or wrong, or anything of any sort negative. I'm simply saying that you cannot talk about something as a findU analog if it only cherry-picks the easy stuff.

> In the end the raw data may not live in the system for very long, but
> those end-product views are longer-term data.
> Like:
>
> http://aprs.fi/weather/OH2KXH/year
> http://aprs.fi/telemetry/OH2RDK-5/month

Talk about hard to find info, there is nothing on the home page that indicates this is available on aprs.fi. At least findU has a list of available cgi's and their parameters. This is better than I thought was available there, though I still don't see a way to get anything other than the handful of preset views. Is there a way to show a detailed plot of high resolution data for an arbitrary time?

>> How are you going to show month+ long tracks?
>
> That all means that:
> - Data is kept on persistent database (no ram-only nodes)
> - Its insertion must be cheap (as "quick")
> - Its retrieval must be cheap (which may make the insertion less cheap...)
>
> Disk space keeps growing, still the disks can handle only so many IO
> operations per second because moving IO heads along the disk surface and
> spinning the disks themselves do take roughly the same time now that they
> took 10 years ago. Thus a single terabyte disk is no _faster_ to do IOs
> than single 10 GB disk.

Assuming all other parameters are identical, that is true. My first hard drive was a 16 MB (yes, megabyte) drive (the size of a shoebox) I paid $3000 for in 1979. I can assure you, its throughput was far below even the slowest drive you can buy today. All drives are not created equal.
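As a rough illustration of the seek and rotation argument, here is a minimal Python sketch. It assumes a simple model in which each random IO costs one average seek plus half a revolution. The seek times used are illustrative guesses, not measurements of any drive mentioned in this thread, and the function names are made up for this example.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the wanted sector: half of one revolution, in ms."""
    return 0.5 * 60_000.0 / rpm


def random_iops(rpm, avg_seek_ms):
    """Rough ceiling on random IOs per second for a single spinning drive.

    Simplified model: every IO pays one average seek plus the average
    rotational latency. Transfer time and caching are ignored.
    """
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))


# Hypothetical drive classes; the seek times are assumptions for illustration.
drives = [
    ("consumer, 7200 RPM, 9.0 ms seek", 7200, 9.0),
    ("server, 15000 RPM, 3.5 ms seek", 15000, 3.5),
]

for label, rpm, seek_ms in drives:
    print(f"{label}: about {random_iops(rpm, seek_ms):.0f} random IOs per second")
```

Capacity does not appear anywhere in this model, which is the quoted point about a terabyte disk versus a 10 GB disk. Rotation speed and seek time do appear, and those differ between drive classes.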
High end servers use drives that spin faster (less waiting for the data you want to rotate under the head, and a shorter time needed to read and write a chunk of data) and have faster seek times (shorter time until the drive can start looking for the right sector), which are much faster than consumer class drives.

findU uses six 146GB drives in a RAID 1+0 array. Data is evenly split between three pairs of drives. The striping of data into three groups is the RAID 0. Each bit of incoming data is written onto both drives of a pair; this mirroring is the RAID 1. Since each drive in a pair has identical data on it, reads can happen from either drive. So, each drive must handle one third of the writes and only one sixth of the reads. Combine this RAID performance with the high end disk performance, and you get a system that can handle maybe 10 times the throughput of a consumer drive. Not cheap, and not high capacity (my desktop Mac has 4 times the storage space of the findU servers), but fast.

And I disagree that there is no change over 10 years in speed. At the low end, while the emphasis has indeed been on increasing capacity, there have been improvements in speed. Ten years ago even a desktop did not often have 7200 RPM drives; now I have one in my laptop. At the high end there have been large improvements in speed and less in capacity.

> One needs to have multiple disks for: data mirrors so that single disk can
> fail without data loss or even service loss, _and_ for IO parallelism.

IO parallelism is about speed. Once you have parallelism that travels the internet, you lose a lot of speed. The fastest ping time is longer than the slowest seek time. If you use a distributed database that is not within a single data center, user experience will suffer. I don't consider alexa.com reliable for traffic rankings because of their non-random sample, but they have a good metric for response time.
I'm proud they rank findU as very fast; at 0.7 seconds it beats 87% of web sites. As reference arrl.net is 3 seconds and qrz.com is 5 seconds. aprs.fi and aprsworld do not have numbers because they fall below the rankings at which alexa performs speed tests. There are many studies that show more than a couple seconds response time adversely colors users' perception of a web site.

When looking at reliability for a distributed system, you need to look at the reliability of each server to decide how much redundancy you need. No matter what, you need two copies of each bit. With a low reliability system (not just hardware, but with this volunteer system Joe goes on vacation and turns off his computer, or there is an ice storm and he loses power or internet), you probably want at least three copies. So if you want each server to have a hundredth of a findU amount of data, now you need 300 machines (100 slices times three copies). Plus you need a way to recognize when one becomes unavailable and mirror the data onto another server. Just another feature to add into the magical central control of the system. I haven't heard, who is going to write this? ;-)

Steve K4HG
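The two pieces of arithmetic in the message above (per-drive load in a six-drive RAID 1+0, and the machine count for a hundred-way split with three copies) can be checked with a short Python sketch. It assumes even striping, reads balanced perfectly across each mirror, and one slice of data per machine. The function names are hypothetical and exist only for this example.

```python
from fractions import Fraction


def raid10_per_drive_load(num_drives):
    """Share of all writes and all reads that one drive handles in RAID 1+0.

    Assumes the drives form mirrored pairs, data is striped evenly across
    the pairs, and reads are balanced evenly between the two mirrors.
    """
    if num_drives < 2 or num_drives % 2 != 0:
        raise ValueError("RAID 1+0 needs an even number of drives (at least 2)")
    pairs = num_drives // 2
    write_share = Fraction(1, pairs)      # each write lands on both drives of one pair
    read_share = Fraction(1, num_drives)  # each read is served by one drive only
    return write_share, read_share


def machines_needed(slices, copies):
    """Machines required if each one holds a single slice of the data."""
    return slices * copies


writes, reads = raid10_per_drive_load(6)
print(f"six drives: each handles {writes} of the writes and {reads} of the reads")
print(f"100 slices with 3 copies: {machines_needed(100, 3)} machines")
```

This covers only the counting. Detecting a failed machine and re-mirroring its slice elsewhere, which the message calls out as the hard part, is not modeled here.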
