[lug] Speed up osm2pgsql

Bear Giles bgiles at coyotesong.com
Fri Feb 9 07:45:39 MST 2018


I forgot to mention this earlier - the quoted price per vCPU seems pretty
high. My single-vCPU AWS and DigitalOcean instances cost about $5/month.
You can't do much with them, but they give you a baseline price: $5/month
works out to roughly $0.007/hr on demand, and averaged across my mix of
long-term prepaid systems and on-demand instances it's more like $0.003/hr.

The price you quoted sounds more like the much heftier systems I use for
small (Cloudera Express) Hadoop clusters - those run a lot of services.



On Fri, Feb 9, 2018 at 6:51 AM, <stimits at comcast.net> wrote:

> I was using a mix of small embedded systems to do disk copies from full
> sized systems, followed by restores to the full sized systems (using
> regular disks).
>
> While checking other uses out of curiosity, it seemed that any read
> operation was cached: the second time through, a read was tremendously
> fast compared to the first, because the little embedded system was using
> the full sized system's cache over gigabit. That tiny ARM system was
> outperforming my old workstation (24 cores, 48GB of RAM, three 15k rpm
> disks striped).
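>
> (A quick way to see this page cache effect on any Linux box - the file
> path here is just an example:
>
>     sync && echo 3 > /proc/sys/vm/drop_caches    # as root: flush the page cache
>     time dd if=/some/bigfile of=/dev/null bs=1M  # first read comes from disk
>     time dd if=/some/bigfile of=/dev/null bs=1M  # second read comes from cache
>
> The second run is typically many times faster.)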
>
> The reason this had intrigued me is that many years ago I had written
> management software for diskless clusters. When the clusters used their own
> disks it took about 18 minutes to boot. With NFS, despite using the same
> disks on the host, the entire cluster booted in about 25 seconds once the
> POST was done.
>
> Now imagine the iSCSI server used a RAM disk instead. You won't care
> whether the data is preserved across a reboot - it's just swap. If a lot
> of work is being slowed down by PostgreSQL doing read-ahead into the
> page cache, then it kind of defeats the purpose to read ahead from a
> disk and then swap it back out to a disk. I'm reminded of the phrase
> "the operation was a success, but the patient died". I can't guarantee
> it, but my feeling is that swap over iSCSI to a server with a lot of RAM
> would outperform local swap, if for no other reason than the caching on
> the faster server. If you create a RAM disk on a server with lots of
> RAM, and use iSCSI over gigabit - or even 10gig - I think you will end
> up with something far faster than you expect, without needing the
> hardware physically sitting next to you.
>
> The real question is how to create a RAM disk approximately the size
> you need and then format it as a swap device. Once you have that,
> everything about iSCSI is standard and not difficult, as long as you
> don't need things like encryption or special authentication.
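>
> A minimal sketch of one way to do that, using the kernel's brd ramdisk
> module and LIO's targetcli on the target (the IQNs, addresses, sizes,
> and device names below are placeholders, not tested values):
>
>     # On the target: create a 64GB RAM-backed block device.
>     # brd's rd_size is in KiB, so 64 * 1024 * 1024.
>     modprobe brd rd_nr=1 rd_size=67108864
>
>     # Export /dev/ram0 as an iSCSI LUN.
>     targetcli /backstores/block create name=ramswap dev=/dev/ram0
>     targetcli /iscsi create iqn.2018-02.com.example:ramswap
>     targetcli /iscsi/iqn.2018-02.com.example:ramswap/tpg1/luns \
>         create /backstores/block/ramswap
>     targetcli /iscsi/iqn.2018-02.com.example:ramswap/tpg1/acls \
>         create iqn.2018-02.com.example:initiator
>
> Then on the client, log in with open-iscsi and turn the new disk into
> swap, replacing /dev/sdX with whatever device the session attaches as:
>
>     iscsiadm -m discovery -t sendtargets -p 192.168.1.10
>     iscsiadm -m node -T iqn.2018-02.com.example:ramswap \
>         -p 192.168.1.10 --login
>     mkswap /dev/sdX
>     swapon -p 10 /dev/sdX    # higher priority than any local swap
>
> (Newer targetcli versions also offer a ramdisk backstore that skips the
> brd step, and depending on the version you may need to create a portal
> under tpg1 by hand.)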
>
> FYI, I wouldn't touch NFS in such a situation, for many reasons. For
> all practical purposes, iSCSI is just a local drive to the system using
> it. You can use dd or just about anything that expects a physical disk
> (or loopback device) on iSCSI. My operations never hit a single error
> while transferring 300GB images around.
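>
> For instance, once a session is logged in, imaging a disk over iSCSI is
> the same dd you'd run against a local drive (device names are examples):
>
>     dd if=/dev/sda of=/dev/sdX bs=4M conv=fsync status=progress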
>
> ----- Original Message -----
> From: Lee Woodworth <blug-mail at duboulder.com>
> To: lug at lug.boulder.co.us
> Sent: Fri, 09 Feb 2018 09:02:36 -0000 (UTC)
> Subject: Re: [lug] Speed up osm2pgsql
> On 02/08/2018 03:48 PM, stimits at comcast.net wrote:
> > Something I would ponder just because it is interesting, is that I've
> > found some very good performance in the past using iSCSI. Would it be
> > possible to have a system with this much RAM at a remote location serving
> > iSCSI off of a very large RAM disk, and mount that on the local system
> > as swap?
>
> Were you heavily loading a software-based iSCSI setup (client and
> target)?
>
> I was considering a similar idea of using swap over NFS, but it has
> been reported as having problems under load in the past.
>
> With just straight-up sequential ops (e.g. shredding a large file) I
> have had glusterfs lock up where the only recovery is a client reboot.
> Same for nbd (network block device) and NFS in the 4.14 kernels.
>
> So I wonder about software-based iSCSI reliability under a heavy load.

