Gallery2 in a load balanced or clustered environment?

smcnally

Joined: 2007-03-05
Posts: 34
Posted: Sun, 2007-07-08 17:00

Has anyone used Gallery2 in a load balanced or clustered environment? Have tips for the best approach to improving availability?

We have our two identical machines (SunFire 4600) ready for configuration.
Gallery2 is running well on one of them.
We'll be working on the second one this week. (We announce Snapshot in print July 15)

I. Web app tier

We can load balance through "round robin DNS" (some requests go to Gallery2-Box1; some requests go to Gallery2-Box2).

We'd have the same Gallery2 install / configuration on both boxes.
This evens the traffic at the web and database tiers.
We'd use replication (master / master) to keep the two dbs in sync.
There'd be the issue of persisting sessions across both Boxes.
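
To make the session issue concrete, here is a minimal sketch, assuming strict alternation between the two boxes and session data kept locally on each box. Real round-robin DNS is messier because resolvers cache answers, so treat this as an idealised model. The session ID and the in-memory stores are made up for illustration.

```python
from itertools import cycle

# Box names come from the post; each box keeps its own in-memory session
# store, standing in for session storage that is NOT shared between boxes.
BOXES = ["Gallery2-Box1", "Gallery2-Box2"]


def route_requests(n_requests, boxes=BOXES):
    """Assign requests to boxes in strict rotation (idealised round robin)."""
    rotation = cycle(boxes)
    return [next(rotation) for _ in range(n_requests)]


def simulate_login_then_browse(n_followups):
    """Log in on the first box reached, then count the follow-up requests
    that land on a box which has never seen the session."""
    local_sessions = {box: set() for box in BOXES}
    route = route_requests(1 + n_followups)
    local_sessions[route[0]].add("session-abc")  # hypothetical session ID
    misses = sum(
        1 for box in route[1:] if "session-abc" not in local_sessions[box]
    )
    return route, misses


route, misses = simulate_login_then_browse(4)
print(route)
print("follow-up requests that lost the session:", misses)
```

Every request that lands on the other box finds no session, which is why the sessions have to live somewhere both boxes can reach.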

II. Database tier

We could cluster the database.

A single instance of Gallery2 could point to a clustered mysql db.
This doesn't balance the web traffic load, but handles the db load better and provides needed redundancy.

With hope, this second possibility is straightforward.

I appreciate any insights and tips.

(I've found precious little re replication in the forums; I'll certainly follow up with our solution once it's in place and working.)

Gallery version 2.2.1
PHP version 5.2.1 apache2handler
Webserver Apache/2.0.55 (Unix) DAV/2 PHP/5.2.1
Database mysqlt 5.0.27-log
Toolkits SquareThumb, ImageMagick
Operating system SunOS spardehg02 5.10 Generic_118855-19 i86pc

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Sun, 2007-07-08 17:16

@locking with master/ master replication:
g2's concurrency management works with locks. the lock system is either flock or database based.
if you use database locking (site admin -> general: bottom of the page), and if the non-transactional lock operations (select/insert/update on g2_Lock) are replicated immediately, then it should work.

but maybe it would be better if all tiers would go to the same database for the lock queries.

and if you're using flock, all tiers need to use the same g2data/ folder (not a replicated one).
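
a rough sketch of why a replicated folder doesn't work for flock, assuming a Unix host where Python's fcntl.flock is available. the temp directories and the lock file name are made up for the demo; this is not how g2 names or manages its lock files.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking flock on path.

    Returns the open file object on success, or None if another
    open handle already holds the lock.
    """
    handle = open(path, "a")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def release(handle):
    """Drop the lock and close the file."""
    fcntl.flock(handle, fcntl.LOCK_UN)
    handle.close()


# Stand-ins: one folder that every web server mounts, and one separate copy.
shared_dir = tempfile.mkdtemp()
replica_dir = tempfile.mkdtemp()

lock_a = try_lock(os.path.join(shared_dir, "item-42.lock"))
lock_b = try_lock(os.path.join(shared_dir, "item-42.lock"))
lock_c = try_lock(os.path.join(replica_dir, "item-42.lock"))

print("first holder, shared folder:", lock_a is not None)
print("second holder, shared folder:", lock_b is not None)
print("second holder, separate copy:", lock_c is not None)

for handle in (lock_a, lock_c):
    if handle is not None:
        release(handle)
```

the second attempt in the shared folder is refused, while the attempt against the separate copy succeeds even though the "same" item is already locked elsewhere.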

@sessions:
either have all tiers go to the same db for sessions or replicate it really fast. :)
you could also change the persistence layer for sessions, it's all contained in modules/core/classes/GallerySession.class (load / save / delete).
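
to illustrate the "all tiers go to the same db for sessions" idea, here is a minimal sketch of a store with load / save / delete backed by one shared database. the class, table and column names are hypothetical, sqlite stands in for the real database, and none of this is taken from GallerySession.class.

```python
import json
import sqlite3


class SharedSessionStore:
    """Hypothetical session store (load / save / delete) on a shared database."""

    def __init__(self, connection):
        self.db = connection
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions "
            "(id TEXT PRIMARY KEY, data TEXT NOT NULL)"
        )

    def save(self, session_id, data):
        """Insert or overwrite the session row."""
        self.db.execute(
            "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
            (session_id, json.dumps(data)),
        )
        self.db.commit()

    def load(self, session_id):
        """Return the session data, or None if the session does not exist."""
        row = self.db.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def delete(self, session_id):
        """Remove the session row if present."""
        self.db.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
        self.db.commit()


# Both "web servers" talk to the same database connection.
shared_db = sqlite3.connect(":memory:")
box1 = SharedSessionStore(shared_db)
box2 = SharedSessionStore(shared_db)

box1.save("abc123", {"user": "smcnally"})
print(box2.load("abc123"))  # session saved via box1 is visible via box2
box2.delete("abc123")
print(box1.load("abc123"))
```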

@alternatives:
from what i've read, most people that use g2 in a load-balanced environment have just a single dedicated box for the database. a single db server is shared among all webservers.

--------------
Documentation: Support / Troubleshooting | Installation, Upgrade, Configuration and Usage

 
smcnally

Joined: 2007-03-05
Posts: 34
Posted: Sun, 2007-07-08 18:41

Thank you.

@All tiers to same db:
The db is really the bottleneck, as near as I can figure. Currently, when we have performance issues, it's when the db's under heavy load. I fear if I had two installs pointing to the same db, it would lighten load on apache2 and each of the boxes, but the db itself would still be at least as taxed. (like it is right now ...)

@alternatives:
perhaps I should run Box2 just as the db server. I can have a replica on Box1 (where Gallery2 runs) for backup purposes, but the primary db server would be Box2 (so I can apply more resources to mysql)

@sessions:
we've mucked around in there already, IIRC - that's how we get our j2ee / oracle user sessions to "become" gallery2 user sessions. so, perhaps we should look there, too.

"Alternatives" seems the best alternative for this week: getting the dedicated mysql host and beefing up allotted resources should be helpful ... with hope.

And what about memory_limit php.ini - does bumping that help, in general (we're at 128MB)?

again, thanks -

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Sun, 2007-07-08 18:49

@sessions:
you shouldn't have to change that file for your J2EE integration. GalleryEmbed is there for user / session management integration. (docs -> installation -> embedding)
anyhow, that's just FYI.

@memory_limit:
it's just a limit meaning that when php reaches the limit, it kills the script.
if the limit is too low (e.g. very close to the actually used amount), php slows down because it has to do more garbage collection all the time.
it can't really be too high but setting it far higher than any expected usage level doesn't make sense either.
128 MB should sure be more than enough for any operation in g2.


 
smcnally

Joined: 2007-03-05
Posts: 34
Posted: Thu, 2007-07-12 17:50

We have Multi Master Replication for mysql working well. Gallery2 is successfully running on two separate physical boxes.
The two databases stay in sync in close enough to "real time." The thumbnails and all images, however, are out of sync.

We can cron an rsync job between the image upload dir on the two boxes every minute or so. Is there a better way to handle this?
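
For reference, a small sketch of how the rsync invocation for such a cron job could be assembled. The host name and paths are placeholders, not our real ones. `-a` and `--delete` are standard rsync options; everything else is an assumption. Note this pushes in one direction only, so uploads landing on both boxes would need a run each way.

```python
import shlex


def build_rsync_command(source_dir, remote_host, remote_dir, delete=False):
    """Build an rsync argument list that pushes source_dir to a remote host.

    The trailing slash on the source makes rsync copy the directory's
    contents rather than the directory itself.
    """
    args = ["rsync", "-a"]
    if delete:
        # Also removes files on the destination that no longer exist locally.
        args.append("--delete")
    args.append(source_dir.rstrip("/") + "/")
    args.append(f"{remote_host}:{remote_dir.rstrip('/')}/")
    return args


# Placeholder host and paths for illustration only.
command = build_rsync_command(
    "/var/g2data/albums", "gallery2-box2", "/var/g2data/albums"
)
print(shlex.join(command))
```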

Valiant said:

Quote:
and if you're using flock, all tiers need to use the same g2data/ folder (not a replicated one).

I'll see if we can mount one g2data folder on both hosts.

I'll continue searching the forums; I appreciate any input.

 
smcnally

Joined: 2007-03-05
Posts: 34
Posted: Mon, 2007-07-23 19:05

Hello - We've switched to db locking

@ db locking:

Quote:
@locking with master/ master replication:
g2's concurrency management works with locks. the lock system is either flock or database based.
if you use database locking (site admin -> general: bottom of the page), and if the non-transactional lock operations (select/insert/update on g2_Lock) are replicated immediately, then it should work.

How, specifically, does this work? Does it use the internal DB allocations for auto_increment?
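
As background to the auto_increment part of the question: MySQL has the auto_increment_increment and auto_increment_offset settings, which let two masters hand out interleaved IDs. Whether Gallery2 actually relies on auto_increment for its IDs is not established anywhere in this thread, so the following is only a simplified model of the MySQL mechanism, not a statement about Gallery2.

```python
class AutoIncrementMaster:
    """Simplified model of one master's ID allocation.

    increment and offset mirror the meaning of MySQL's
    auto_increment_increment and auto_increment_offset settings.
    """

    def __init__(self, increment, offset):
        self.increment = increment
        self.next_id = offset

    def insert(self):
        """Allocate the next ID, as an INSERT on this master would."""
        allocated = self.next_id
        self.next_id += self.increment
        return allocated


def ids_after_inserts(master_a, master_b, rounds):
    """Run the same number of inserts on both masters before they sync."""
    a_ids = [master_a.insert() for _ in range(rounds)]
    b_ids = [master_b.insert() for _ in range(rounds)]
    return a_ids, b_ids


# Both masters on default-style settings: same IDs on each side.
naive = ids_after_inserts(
    AutoIncrementMaster(1, 1), AutoIncrementMaster(1, 1), 3
)
# Staggered settings: one master takes odd IDs, the other even IDs.
staggered = ids_after_inserts(
    AutoIncrementMaster(2, 1), AutoIncrementMaster(2, 2), 3
)

print("default:", naive, "overlap:", sorted(set(naive[0]) & set(naive[1])))
print("staggered:", staggered, "overlap:", sorted(set(staggered[0]) & set(staggered[1])))
```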

 
smcnally

Joined: 2007-03-05
Posts: 34
Posted: Mon, 2007-07-30 16:24

Hello -

We're using database locking.

We're still seeing collisions in g2_CacheMap and g2_Entities that're breaking our Multi Master Replication.

So, we can't properly load balance two machines running Gallery2.

Any more info I can provide to make this more clear? Anyone have thoughts as to cause and possible solutions?

Gallery version = 2.2.1 core 1.2.0.1
PHP version = 5.2.1 apache2handler
Webserver = Apache/2.0.55 (Unix) DAV/2 PHP/5.2.1
Database = mysqlt 5.0.27-log, lock.system=flock
Toolkits = ImageMagick, SquareThumb
Acceleration = full/900, full/900
Operating system = SunOS spardehg02 5.10 Generic_118855-19 i86pc
Default theme = SnapshotV1
gettext = disabled
Locale = en_US
Browser = Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
Rows in GalleryAccessMap table = 20082
Rows in GalleryAccessSubscriberMap table = 3858
Rows in GalleryUser table = 9700
Rows in GalleryItem table = 3858
Rows in GalleryAlbumItem table = 38
Rows in GalleryCacheMap table = 80379

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Sat, 2007-08-11 01:43

you shouldn't see any collisions for g2_Entity unless your g2_Lock table isn't synchronized across all db servers. that's crucial. best would be if all g2 servers would operate (select, update, delete) on the same g2_Lock table.

the CacheMap table isn't 100% collision-free, even without replication.
see:
http://sourceforge.net/tracker/index.php?func=detail&aid=1638106&group_id=7130&atid=107130


 
smcnally

Joined: 2007-03-05
Posts: 34
Posted: Sun, 2007-08-12 16:02

Thanks, Valiant -

@CacheMap - My presumption, then, is collisions here shouldn't preclude this set up from working. True statement?

@Entity collisions, what's the best way to operate (select, update, delete) on the same g2_Lock table across all db servers? Is this changing DB Locking preference code?

Currently, each db lock would be against its own g2_Lock table.

Per my notes above, the way we think it's CURRENTLY best for us to be set up (from handling load perspective) is two complete Gallery implementations on two physically distinct servers with clustered mysql dbs (Master / Master Replication). Web requests are load balanced between them.

Until our much beefier hardware comes in, this is what we have to work with.

again, many thanks -

 
valiant

Joined: 2003-01-04
Posts: 32509
Posted: Sun, 2007-08-12 16:44

@CacheMap: shouldn't have anything to do with master-master replication.

@Entity collisions:
if you're using a single web-frontend server, you might as well use file based locking (flock) instead of db based locking.
that seems to be the obvious choice since the locking API of G2 needs a resource that is shared among all request-handlers - which is not the case if you have multiple g2_Lock db tables.

if you want to use db locking, you'd need to make minimal changes to Gallery's storage layer.
that is, instead of using the standard db connection, use a DB connection that goes to a DB that isn't using any mysql replication stuff.
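
a minimal sketch of the difference, assuming a made-up lock table where a row per locked entity is the lock. sqlite stands in for mysql, and the table layout is hypothetical, not the real g2_Lock schema or G2's locking code.

```python
import sqlite3


def new_lock_db():
    """Create an in-memory database with a hypothetical lock table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE lock_table "
        "(entity_id INTEGER PRIMARY KEY, holder TEXT NOT NULL)"
    )
    return db


def acquire(lock_db, entity_id, holder):
    """Take the lock by inserting a row; the primary key rejects a second holder."""
    try:
        lock_db.execute(
            "INSERT INTO lock_table (entity_id, holder) VALUES (?, ?)",
            (entity_id, holder),
        )
        lock_db.commit()
        return True
    except sqlite3.IntegrityError:
        lock_db.rollback()
        return False


def release(lock_db, entity_id, holder):
    """Give the lock back by deleting the holder's row."""
    lock_db.execute(
        "DELETE FROM lock_table WHERE entity_id = ? AND holder = ?",
        (entity_id, holder),
    )
    lock_db.commit()


# Each box has its own lock table: both think they hold the lock on entity 42.
box1_locks, box2_locks = new_lock_db(), new_lock_db()
separate = (acquire(box1_locks, 42, "box1"), acquire(box2_locks, 42, "box2"))

# One lock database shared by both boxes: the second request is refused.
shared_locks = new_lock_db()
shared = (acquire(shared_locks, 42, "box1"), acquire(shared_locks, 42, "box2"))

print("separate lock tables:", separate)
print("shared lock table:", shared)
```

with separate tables both boxes go on to write the same entity, which is the kind of collision that then breaks replication.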


 
Makea

Joined: 2005-05-11
Posts: 7
Posted: Mon, 2007-08-13 20:05

1. Are you taking advantage of Solaris 10 zones?

If not, you should! Consider creating 3 non-root zones per server, all nodes in a MySQL 5 Cluster. You would have a six node cluster across both machines. Create a 4th non-root zone for apache on each server and use resource pools to manage memory/cpu across all the zones. Zones are lightweight and take advantage of the multi-threaded, multi-core cpus in your server.

2. Buy a hardware load balancer. If price is an issue, Coyote Point makes several models under $5k. There are numerous articles online about the disadvantages of round-robin DNS. A hardware load balancer intelligently assigns requests based on several admin-chosen algorithms, and most provide client session migration/draining. There are great software solutions, but the excellent ones are only for Linux, and you'd need to buy another server.

After this year, my next project involves global-geographically based load balancing, which will utilize Solaris 10 heavily. We already do load-balancing across dozens of zones on multiple machines at work.

 
smcnally

Joined: 2007-03-05
Posts: 34
Posted: Mon, 2007-08-13 20:22

Thanks for your reply, Makea -

@Zones - We are not availing of them for this application. Currently, we're completely utilizing our CPU (generally we're at 0% idle CPU during peak times), so I don't believe zones would help us. We're running on SunFire 4200s - we're in the process of procuring beefier hardware.

@Load Balancing - We've got F5 or Cisco equipment providing load balancing already. The issue has been we can't run load balancing as Multi Master Replication is getting screwed up due to collisions in the g2_Entity and g2_Cache tables.

We've got an added Layer of Fun because we're sharing user session info across our java environment on parade.com and LAMP environment on snapshot.parade.com.

As soon as we have these issues sorted out, we'll be able to share load across two servers, each with its own instance of httpd, mysql and Gallery. Until we can address the collision issues, we're stuck on a single box.

again, thanks very much, and I'll be reporting progress here in the forums.

 
schaef350

Joined: 2009-02-04
Posts: 7
Posted: Fri, 2009-08-07 20:16

Has anyone done any more work with load balancing?

I am trying to do something somewhat similar in this post: http://gallery.menalto.com/node/90029

 
hans51

Joined: 2006-07-07
Posts: 97
Posted: Wed, 2010-01-20 06:52

since mid December 2009 I use round robin - 3 servers for gallery ( and my site )
- db on one server for all 3 gallery servers
- upload to one precise (master) gallery server
- immediately after upload of new images rsync gallery-folder/ with other 2 servers ( without the --delete option to maintain existing resizes on other servers )
currently some 175000 hits/day on gallery ( all 3 servers ) and no problem so far
- NO other users at all - just 1 admin user
- db server also is used as web server but assigned only half the traffic of other 2 round robin servers - result is a load distribution to within approx 10% among all 3 servers
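
a rough sketch of the traffic split described above, assuming weights of 2 : 2 : 1 (the db server getting half the share of each of the other two). the server names are placeholders, and how the weighting is actually configured is not shown here; this only models the resulting request counts.

```python
from collections import Counter
from itertools import cycle, islice

# Placeholder names; the db server gets half the weight of the other two.
WEIGHTS = {"web1": 2, "web2": 2, "db-and-web": 1}


def weighted_rotation(weights):
    """Endless rotation in which each server appears once per unit of weight."""
    slots = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(slots)


def distribute(n_requests, weights=WEIGHTS):
    """Count how many of n_requests each server receives."""
    return Counter(islice(weighted_rotation(weights), n_requests))


counts = distribute(175000)  # roughly one day of hits
for server, hits in counts.items():
    print(server, hits)
```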

hans