I took the latest, 1.5.1, and performance got worse. Even Serendipity is slower now. Measurably so.
I'll try 1.4.10.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Mon, 2006-09-25 03:07
Well, I've been playing around.
I've compiled various versions of apache, with various config options.
Various versions of php with various config options.
Nothing seems to get rid of the initial stall, although httpd is now at the top of the list instead of db2sysc.
I've tried APC, the Zend Optimizer, and the Zend Core for IBM, with both the AIX compiler and the GNU compiler. The GNU compiler produced a horribly slow setup. APC beats the Zend Optimizer hands down every time. I know they do different things (APC is caching, Zend Optimizer optimizes code more efficiently), but they don't play nice together, so you can only have one.
This machine has been beaten the last 3 days doing almost nothing but compiling.
Zend Core was OK, it had a nice web interface for managing things, but it's not optimized for my machine. I can specify to the AIX compiler some options specifically for my processors, number of processors, etc. Benchmarking with and without just these options is about a 3-8% improvement, depending on what I was doing.
The fastest setup I've been able to produce is:
php 5.1.6, apache 2.2.3, ibm_db2 1.5.1 (the first 1.5.1 I compiled earlier in the week picked up the GNU compiler; that's why it was noticeably slower).
Oddly enough, it's the latest levels of everything. PHP is a DSO.
php 4.4.4, apache 1.3.37, ibm_db2 1.5.1 was a very close second. Almost negligible: a difference of 0.3 seconds, but always the 0.3 difference. This was PHP compiled statically into apache instead of being a DSO.
Oh, and this time around I figured out how to make DSO modules for php easily (where you just put the .so somewhere and declare it in php.ini). So ibm_db2 is now a DSO, albeit with no gain or loss in performance.
I also upgraded my OS from AIX 5.3 ML03 to ML05+.
DB2 is still FP11. I'm going to try FP13, and to re-create the database entirely at the FP13 level and just import the g2 data into it.
I'm down to a 2-3 second stall for most operations in g2, and httpd is at the top now, not db2sysc, but db2sysc is right there under it at 14% CPU.
BTW, don't try apache with the mpm=worker configure option; it seg faults on anything with the ibm_db2 driver. Had to use prefork (which is what I always had, but in some reading I did, one guy said worker was faster on AIX).
Anyhow, I'm wiped out from all this building and reading (I was up till 5am last night; I really need something better to do. Actually, I have better things to do, these types of problems just eat at me though; I lie in bed just thinking of another possible solution, and of course I have to try it).
I can still produce the stall in a short test script with an IN clause (and I suspect other types of clauses do the same thing; I rewrote all the g2 IN clauses in x=2 OR x=3 OR x=5 format and still get a stall), under any build setup, so I'm pretty sure it's not me or how I built it at this point. Doing some reading, I found some things that say the ibm_db2 driver is optimized for Windows, UNIX being an afterthought, so maybe it's just that way for us unix guys, or maybe it's just AIX. But that was just one guy's blog, so I don't know how much truth there is to that. I'll be posting on the ibm_db2 mailing list and see what they have to say about it.
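For reference, the kind of short test script I mean is basically just timing one db2_exec() call with an IN clause; a rough sketch, with placeholder credentials and IDs:

<?php
// Minimal stall repro sketch. Database name, credentials and the ID list are
// placeholders; substitute real g2_Item ids from your install.
$conn = db2_connect('gallery2', 'username', 'password');
if (!$conn) { die(db2_conn_errormsg()); }

$sql = 'SELECT g_id, g_title FROM g2_Item WHERE g_id IN (4592,4595,4598,4601,4604)';

$start = microtime(true);
$stmt  = db2_exec($conn, $sql);
printf("db2_exec took %.6f seconds\n", microtime(true) - $start);

db2_close($conn);
?>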
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Mon, 2006-09-25 04:38
So, umm, I see this config option for persistent connections and I set it to true.
That cuts it in half. I'm at 1-2 seconds now... for 40 thumbs, and the mod_rewrite module and image block modules are on again for most albums.
What's the impact here, other than I see 8 applications owned by g2 always connected with db2 list applications? This is fine from my perspective.
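For reference, the only change on the PHP side is using db2_pconnect() instead of db2_connect(); a minimal sketch (database name and credentials here are just placeholders):

<?php
// Regular connection: opened and torn down on every request.
// $conn = db2_connect('gallery2', 'username', 'password');

// Persistent connection: kept open and reused by the same httpd child across
// requests, which is why "db2 list applications" shows a handful of
// long-lived connections owned by the g2 user.
$conn = db2_pconnect('gallery2', 'username', 'password');
if ($conn) {
    $result = db2_exec($conn, 'SELECT COUNT(*) FROM g2_Item');
    $row = db2_fetch_array($result);
    echo $row[0];
}
?>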
I'm almost happy. I think that's about the best I'm gonna do.
valiant
Joined: 2003-01-04
Posts: 32509
Posted: Mon, 2006-09-25 13:20
What happened to the connects? Still doing 40 connects per page with rewrite + mod_rewrite (not pathinfo)?
Since it's using persistent connections now, there shouldn't be a large performance penalty involved with that, but with x parallel users, you will run into connection limits pretty quickly.
And what about the IN () performance issue? Is that something the DB2 team should follow up with?
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Mon, 2006-09-25 14:04
73Blazer, thanks for spending so much time debugging this. I'm a little obsessive about
problems too.
I don't buy the "maybe it's just that way for us unix guys, or maybe it's just AIX". I can
almost guarantee you that it was developed and extensively tested on Linux. And there have
been Linux people posting here saying that their performance is fine. One person even emulated
200 users, IIRC.
> Repro script
If I understand correctly, you can repro the hang only when using db2_prepare() and
db2_execute(). I'm not convinced that this is the same problem as you are having in Gallery...
Gallery does not use db2_execute()... it uses db2_exec() exclusively, as far as I can see.
> Multiple connects
What's the latest here? Were the 40 connects due to using path_info in your rewrite module or
not? I'd prefer to debug the Gallery code so that we find out exactly why this is
happening, rather than try to deduce possible reasons. This is supported by the fact that
things improve to almost-normal when you use persistent connections. And it tells me that the
IN() clause is not the problem. A separate problem perhaps, but not one that I think you are
hitting in Gallery. Think about it... the timestamps you put in your Gallery code did not
show a slowdown in any statement execution. I think your main problem is the number
of connects.
Unless I'm forgetting something?
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Mon, 2006-09-25 15:06
It's the db2_exec that is the problem. The 'IN' clause problem goes away with db2_prepare/db2_execute. I initially posted the other way around, then corrected myself.
So how does one emulate 200 connected users?
Yeah, I know the problem is still there; I've just masked it with caching, pconnect, and more efficient apache/php. The stall is still there, just not as in-your-face as it was.
I still see 42 calls to connect (pconnect now) when the mod_rewrite module is active. If I deactivate it, that goes to 2; deactivate image block and it goes to 1. Mostly what's improved is the loading of the actual images, already past the initial stall. So with a regular connect, I get the stall, the page starts to load, images show up one by one (keep in mind I'm on the same network here for testing), and the page takes another 2-4 seconds to finish loading, but at least you have something to look at. With pconnect, the page finishes loading in 0.5 seconds. But the initial stall still persists.
Yeah, I would like to know *exactly* what's causing it as well.
So, one thing I noticed: you say to create the database with the UTF-8 codeset and territory US. How important is that? I never specified those, and DB2 defaults to the ISO-8859-1 codeset.
Larry Menard
Joined: 2005-10-01
Posts: 757
> how does one emulate 200 connected users?
See Sphericus' posting: http://gallery.menalto.com/node/37648?page=17#comment-186478
I gather his server is on Linux but his client was on Windows.
> UTF-8 codeset and territory US
To be honest, I forget, but I think it was to support unicode characters in the text fields.
I'm not sure if this could cause more connects; I suspect not, but I'm not sure. Might be
worth a quick test, at least so that we can exclude it as a suspect.
> pause between displaying each photo
The reason you're seeing pauses between each photo when using regular (non-persistent)
connections is probably that a connect/disconnect is happening between each image. The
symptom seems to fit.
And I doubt it's related to the IN predicate because you previously said that you get the same
response degradation even when using much smaller pages (which would have a much smaller IN
clause). That is still interesting though, and I'll try to follow up on that if I can repro it.
So again, my suggestion is to find out why Gallery (and specifically the URL Rewrite module, it
appears) is causing those multiple connects.
My URL Rewrite module is configured to use PHP's "pathinfo" (not Apache's "mod_rewrite"), with
the only active components being "Permalinks" and "URL Rewrite / Show Item". Any chance you
could switch to using PHP's "pathinfo" instead of Apache's "mod_rewrite" to see what happens?
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Mon, 2006-09-25 16:38
When I enable "Download Item" in my URL Rewrite module, I also see a severe degradation, in which
I can see a definite pause between each image being loaded. Doing frequent "db2 list
applications" commands on the server while this is happening, I see many database connections
coming and going. Very similar to your findings.
When I disable "Download Item", the degradation and the multiple DB connections go away.
Do you have "Download Item" configured in your URL Rewrite module?
valiant
Joined: 2003-01-04
Posts: 32509
Posted: Mon, 2006-09-25 16:52
Quote:
When I enable "Download Item" in my URL Rewrite module, I also see a severe degradation, in which
I can see a definite pause between each image being loaded. Doing frequent "db2 list
applications" commands on the server while this is happening, I see many database connections
coming and going. Very similar to your findings.
That's pretty much why we warn against using the DownloadItem rule with pathinfo.
The warning is shown when you install the rewrite module and click configure. It's on the page where you choose between pathinfo and mod_rewrite.
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Mon, 2006-09-25 16:59
That's exactly why I decided to try it... I'm wondering if that's what 73blazer is doing.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Mon, 2006-09-25 17:55
Yeah, it is what I'm doing. But that only slows the performance after the page has started to load. Even if I shut off the rewrite module completely in Site Admin, the initial page stall is still very present.
Persistent connections make the Download Item degradation go away.
I used the Download Item option because the URL for the actual re-sized picture (which I use a lot in forums) is a lot cleaner. I can turn it off, but then all my posts since May in various forums won't show a picture, so I'm a little bit screwed there.
How do I switch to use PHP pathinfo? Do I have to take apache mod_rewrite module out in httpd.conf?
I used the Microsoft tool to run a 200-connection stress test. No real problems, no timeouts; the report shows the times for requests jump a few seconds when you use 200 users as opposed to 20 users, but that would be expected. I need the devel version of g2 to get lib/tools, correct? Can I just put that directory in, or should I replace my whole g2 install with the devel version?
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Mon, 2006-09-25 18:11
Ah, OK... I didn't realize that you were distinguishing an "initial stall" from a "between images" stall.
In that case, perhaps the "initial stall" is the IN(...) predicate problem you're having?
> Microsoft Stress Tool
So running the Microsoft stress tool you didn't see any stalling??? OK, I'm stunned at that.
Could it be a browser issue then?
> How do I switch to use PHP pathinfo?
I think you might need to un-install and re-install the URL Rewrite module. I can't find any
way to change that setting post-install.
> version of G2 needed for "lib/tools"
I'm not sure of that, but I wouldn't mix-&-match... I suggest you grab a whole development
build.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Mon, 2006-09-25 18:21
Oh, there's stalling. I just said 200 users wasn't much worse than 20 (about 0.241 seconds longer per request on average, not too bad actually). But in all cases, the initial page load takes too long: avg 0.98 seconds to the first byte received, which is my stall. If I turn off APC, the first byte is received in 3.72 seconds. If I put my IN clauses back, 6.712 seconds.
Can I have two gallery2 installs pointing at the same DB? That should work, I would think.
valiant
Joined: 2003-01-04
Posts: 32509
Posted: Mon, 2006-09-25 18:29
Quote:
Persistent connections make the Download Item degradation go away.
Nope, there are still x queries per download request that would not exist otherwise.
Just don't use pathinfo!
Use mod_rewrite instead.
Disable and uninstall g2's rewrite module, then install it again and, when configuring, choose mod_rewrite instead of pathinfo.
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Mon, 2006-09-25 18:36
73blazer:
OK, thanks for clarifying that.
Yes you should be able to have two Gallery installs pointing to the same database. Just get
another copy of Gallery, & copy the "config.php" from your original directory into it. I do
things like that frequently on my test system.
Valiant:
Quote:
Just don't use pathinfo!
Use mod_rewrite instead.
Not sure I understand that. I think he already uses mod_rewrite. I use pathinfo and I don't see the problem unless I enable
"Download Item".
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Mon, 2006-09-25 19:03
True, queries are still passed, and they return quite quickly (0.007212 seconds), but at least I don't have to wait for a connect with persistent connections, which was causing a slowdown when I had connect instead of pconnect.
Anyhow, I did use pathinfo when I installed that. I was wondering what you guys were talking about with this pathinfo stuff; I remember why now.
Trying to switch now, I can't get the mod_rewrite method to pass.
I use it in other applications, and can make my own quick test, and it's definitely loaded and working in apache.
I'm looking at another thread here where people have said they have this problem with g2.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Tue, 2006-09-26 03:01
I just upgraded to DB2 FP13. A bug was fixed for another application (non-PHP), but performance benchmarking against various db2/php apps shows little to no gain.
Oh well, I'm on the bleeding edge of everything now.
Haven't looked at the mod_rewrite thing yet; that has to be something stupid. But I don't think it's my problem, because I tried deactivating and de-installing it, and I still have the stall on page load.
So, what else does the devel version do that could be useful here for trying to find where/what is taking the time?
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Tue, 2006-09-26 09:29
> stall on page load
Well, now that you have identified that one query that runs slower when run via db2_exec() than
it does with separate db2_prepare() and db2_execute(), let's look at it from that angle.
I wonder if you can use Explain to dump the Access Plan for that statement in each of the two
scenarios. First of all, it should not be different between the two. Secondly, the DB2
Optimizer should be using the most efficient Access Plan. I'm not a good judge of whether or
not that is true, but I've seen you posting to DB2 newsgroups so you could ask there for
confirmation.
If you have the ability to run that same statement in a C or java app (not PHP) then maybe that
could confirm or eliminate PHP and ibm_db2 as suspects.
We could also try running a DB2 Monitor Snapshot for Dynamic SQL. Not sure if that will tell
us anything that we don't already know, but it might be worth a try.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Tue, 2006-09-26 15:38
Quote:
First of all, it should not be different between the two.
I've read to the contrary. The prepare/execute is supposedly better performing. I can't find it now, but it was an article on the ibm.com website. It said it outright. Right underneath it, though, it says db2_execute is just a wrapper to db2_exec. So, I don't know.
Good call with the C thing.
I took all the statements that I captured in my SQL trap from going from one page to the next and ran them from C, since I have a nice little C/DB2 app I'm developing and the whole framework was already there.
The whole lot of statements ran in 0.113095270012 seconds. Of course I wasn't parsing the results, just throwing them away, but the execution of the statements was what I was interested in.
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Tue, 2006-09-26 21:02
You may have read something about whether or not Static SQL is faster than Dynamic SQL,
that is pretty hotly debated. But that's more a result of caching. I'm talking about only
the selection of an access plan by the DB2 Optimizer. It should choose the same access plan
for a given statement regardless if it is processed as static or dynamic, and that's what I want to confirm is true in your case.
Is your C program a CLI application that uses SQLExecDirect()? I believe SQLExecDirect() is
the equivalent of db2_exec()... just like SQLPrepare() and SQLExecute() are the equivalent of
db2_prepare() and db2_execute().
If so, then I'd say that's a fair comparison. And if the access plan selected by
the DB2 Optimizer for that statement in the two PHP scenarios is the same, then I would
strongly suspect that the problem is somewhere in either PHP or ibm_db2. And in that case
I think we'd need to instrument (put timestamps into) the ibm_db2 driver to see where it's
spending its time in both scenarios.
Side note: In the CLI traces you collected, I see SQLPrepare() and SQLExecute() being used,
not SQLExecDirect(). But I think that might be because SQLExecDirect() calls SQLPrepare() and
SQLExecute() under the covers. After all, every statement needs to be Prepared at some
point before it can be Executed... it's just a question of when the Prepare is done.
Plus, if you have the time, I would like to know if using a database created with the UTF-8
codeset and territory makes any difference. We may as well compare apples to apples.
This is just too weird. God how I hate performance problems. Give me a nice solid functional
problem anytime, but I hate performance problems.
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Wed, 2006-09-27 01:44
I'm attaching a php script you can use to get the EXPLAIN info.
Make sure your EXPLAIN tables are created (run sqllib/misc/EXPLAIN.DDL).
Edit the dbname, username and password.
Un-comment the CURRENT EXPLAIN MODE stuff.
Run this script once with $separate_prep_and_exec=false and once with $separate_prep_and_exec=true.
Connect to the database and "select total_cost from explain_statement".
The total cost of the two statement executions should be identical.
You can re-comment out the EXPLAIN stuff and run the script a number of times to get average
execution times. My times are pretty consistent regardless of using separate prep and exec
or not.
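Reading the total_cost values back can be a few lines of PHP as well; a sketch only, with placeholder database name and credentials:

<?php
// After a run with EXPLAIN enabled, pull the cost estimates back out of the
// explain tables. Database name and credentials are placeholders.
$conn = db2_connect('gallery2', 'username', 'password');
$stmt = db2_exec($conn, 'SELECT total_cost FROM explain_statement');
while ($row = db2_fetch_array($stmt)) {
    printf("total_cost = %s timerons\n", $row[0]);
}
db2_close($conn);
?>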
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Wed, 2006-09-27 03:46
OK, I'm gonna admit something I probably shouldn't; I hope my customers don't see this.
Truthfully, I don't know what my C++ application is doing. I started with some examples out of the db2_08_01/samples/cpp directory in an attempt to solve a problem with a certain customer, and found it performed well and was easy to use, so I use it.
As to why I don't know, I'll spare you the explanation here in this forum; it has something to do with the fact that, well, I'm not well versed in certain things (read: I'm a hack).
So, anyway, sorry.
It's ".sqC" code that does queries in the format of:
EXEC SQL BEGIN DECLARE SECTION;
  char pv_oid[33];
EXEC SQL END DECLARE SECTION;

EXEC SQL DECLARE c43 CURSOR FOR
  SELECT hex(V514LASTVERSION) FROM LCADMIN.VPMPARTMASTER
  WHERE hex(OID)=:partmaster FOR FETCH ONLY;
EXEC SQL OPEN c43;
EXEC SQL FETCH c43 INTO :pv_oid;
EXEC SQL CLOSE c43;

or (prepare method, which is what I use/used for the g2 trapped SQL test):

EXEC SQL BEGIN DECLARE SECTION;
  char lcatable[16];
  char Stmt2[5000];
EXEC SQL END DECLARE SECTION;

MySelect = "select rtrim(S_TYPE) from " + Templcaid + " where $COID=x'" + Tempparent + "' FOR FETCH ONLY";
Stmt2[0] = '\0';  /* start with an empty buffer */
strncpy(Stmt2, MySelect, MySelect.GetLengthInChar()+1);
EXEC SQL PREPARE SStmt FROM :Stmt2;
EXEC SQL DECLARE c5 CURSOR FOR SStmt;
cout << "SQLCode C/P Prepare [" << sqlca.sqlcode << "]" << endl;
EXEC SQL OPEN c5;
EXEC SQL FETCH c5 INTO :lcatable;
cout << "SQLCode C/P Fetch [" << sqlca.sqlcode << "]" << endl;
cout << "COID [" << Tempid << "] PARENT [" << Tempparent << "]" << endl;
EXEC SQL CLOSE c5;
Is that CLI, or something else? I don't really know (yes, I know, sad). I don't see SQLExecute anywhere in the APIs/header files I'm using.
I'm in the midst of re-making my database under FP13 (as opposed to migrating from FP4 -> FP11 -> FP13, which is where it is now; it shouldn't make any difference, I know, but the old-school part of me says migrations are usually inferior to fresh installs, so I eliminate the possibility at the very least), and I'll just export/import the tables into it. I'll be sure to specify TERRITORY and CODESET this time.
Thanks for the PHP script for the explain stuff! Umm... it's 0 length?
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Wed, 2006-09-27 20:31
No prob, here's the Reader's Digest version of the C application development options.
".sqC" files are source files that must be 'precompiled' by DB2 before they can be compiled.
This 'precompilation' does two things:
- Transforms all the "EXEC SQL ..." stuff into valid C code.
- All SQL statements that were coded with "EXEC SQL ..." are passed through the Optimizer. At this time the Optimizer checks the Statistics and decides what type of Access Plan to use for the statement. This Access Plan information is stored in the database, waiting for the statement to be executed.
The result of 'precompilation' is a ".C" file, which can then be compiled, linked with DB2
libraries, and executed.
The second bullet (the selection of the Access Plan) means that this is "Embedded Static SQL",
because DB2 already knows how it will handle the statement even before the statement is
actually executed.
The alternative is "Dynamic SQL". In DB2, this means the "Call Level Interface", or CLI.
In CLI, applications are coded in pure C code (no "EXEC SQL ..." macros) in ".C" files, so the
application source code is never 'precompiled', so DB2 will not know what type of Access Plan
to use for the SQL in that application until the application explicitly 'prepares' the
statement. Instead of the "EXEC SQL ..." macros, CLI uses the APIs I mentioned above...
"SQLPrepare()" and "SQLExecute()", or "SQLExecDirect()". And since the source file is never
'precompiled', it is compiled and linked with DB2's CLI library immediately. CLI samples are
in the "sqllib/samples/cli" directory.
(Note that it is possible (but not commonly done) to combine Static and Dynamic SQL in the same
".sqC" source file. It is basically a ".sqC" file that has to be both 'precompiled' as well as
linked with both Static SQL and CLI libraries.)
So, what you've done with your ".sqC" application is Embedded Static SQL. What I'd like to test
is Dynamic SQL (CLI) using SQLExecDirect() to run the statement.
Since 'ibm_db2' is a CLI application, it is basically a 'layer' that sits on top of CLI. I
want to bypass that 'ibm_db2' layer and give the statement directly to CLI, to make sure it
performs well there. If it does, then the problem is likely in the 'ibm_db2' layer.
So what I would recommend is looking at the CLI sample program "sqllib/samples/cli/tbread.c"
and replacing the guts of it with that one SQL statement with the IN clause. (The sample
unfortunately requires certain support files that are in the same directory... various utility
routines. You can probably just remove that junk from the source file.) To compile & link the
source file, see the "bldapp" script in that same directory.
Hope that's helpful.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Wed, 2006-09-27 20:58
I created a new database with UTF-8 CODESET and TERRITORY US. No change in performance; the numbers match exactly.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Wed, 2006-09-27 21:09
I've got the tbread.c example. I'll pass my statements through there and see what it does.
Yeah, that does help, thanks! I notice when I compile my application it makes a .C file for every .sqC I have. Now I know why.
That PHP file you attached in the previous note is 0 length; do you still want me to try that?
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Wed, 2006-09-27 21:19
> UTF-8
OK, at least we can eliminate that. Thanks.
> 0-byte
Oh, sorry, I misunderstood... I thought you were saying the value it returned was 0 bytes.
I've been having a lot of damn problems attaching files here. Here's the script:
<?php
//Does separate db2_prepare() and db2_execute() take longer than db2_exec()?
//$separate_prep_and_exec = true;
$separate_prep_and_exec = false;
$dbname = "gallery2";
$username = "........";
$password = "........";
if (key_exists('SERVER_SOFTWARE', $_SERVER))
{
$br = '<br>';
}
else
{
$br = "\n";
}
// Connect to the database with specified username and password
$dbconn = db2_connect($dbname, $username, $password);
if ($dbconn)
{
// Un-comment these to capture EXPLAIN tabular data (requires the EXPLAIN tables to exist).
// After execution, do "select total_cost from explain_statement".
//
// $stmt = "SET CURRENT EXPLAIN MODE = YES";
// $result = run_statement($stmt);
// Begin constructing the statement
$stmt = "SELECT g2_PhotoItem.g_width, g2_PhotoItem.g_height, g2_DataItem.g_mimeType,";
$stmt .= " g2_DataItem.g_size, g2_Item.g_canContainChildren, g2_Item.g_description,";
$stmt .= " g2_Item.g_keywords, g2_Item.g_ownerId, g2_Item.g_summary, g2_Item.g_title,";
$stmt .= " g2_Item.g_viewedSinceTimestamp, g2_Item.g_originationTimestamp,";
$stmt .= " g2_FileSystemEntity.g_pathComponent, g2_ChildEntity.g_parentId, g2_Entity.g_id,";
$stmt .= " g2_Entity.g_creationTimestamp, g2_Entity.g_isLinkable, g2_Entity.g_linkId,";
$stmt .= " g2_Entity.g_modificationTimestamp, g2_Entity.g_serialNumber, g2_Entity.g_entityType,";
$stmt .= " g2_Entity.g_onLoadHandlers FROM g2_PhotoItem, g2_DataItem, g2_Item,";
$stmt .= " g2_FileSystemEntity, g2_ChildEntity, g2_Entity";
$stmt .= " WHERE g2_PhotoItem.g_id IN (";
// If using separate prep and exec, use parameter markers & pass an array of values
// for the IN clause, else hard-code the values for the IN clause.
if ($separate_prep_and_exec)
{
for ($ctr = 1; $ctr <= 40; $ctr++) // parameter markers ("?")
{
$stmt .= "?";
if ($ctr < 40)
{
$stmt .= ",";
}
}
// 73blazer's values for the IN clause
$parameterArray = array(4592,4595,4598,4601,4604,4607,4610,4613,4616,4619,
4622,4625,4628,4631,4634,4637,4640,4643,4646,4649,
4652,4655,4658,4661,4664,4667,4670,4673,4676,4679,
4682,4685,4688,4691,4694,4697,4700,4703,4706,4709);
// Larry's values for the IN clause
//for ($ctr = 0; $ctr < 40; $ctr++)
//{
// $parameterArray[] = 46 + $ctr * 2;
//}
//var_dump($parameterArray);
}
else
{
// 73blazer's values for the IN clause
$stmt .= "4592,4595,4598,4601,4604,4607,4610,4613,4616,4619,";
$stmt .= "4622,4625,4628,4631,4634,4637,4640,4643,4646,4649,";
$stmt .= "4652,4655,4658,4661,4664,4667,4670,4673,4676,4679,";
$stmt .= "4682,4685,4688,4691,4694,4697,4700,4703,4706,4709";
// Larry's values for the IN clause
//$stmt .= "46,48,50,52,54,56,58,60,62,64,66,68,70,72,74,76,78,80,82,84,";
//$stmt .= "86,88,90,92,94,96,98,100,102,104,106,108,110,112,114,116,118,120,122,124";
}
// Finish constructing the statement
$stmt .= ") AND g2_DataItem.g_id=g2_PhotoItem.g_id AND g2_Item.g_id=g2_PhotoItem.g_id";
$stmt .= " AND g2_FileSystemEntity.g_id=g2_PhotoItem.g_id";
$stmt .= " AND g2_ChildEntity.g_id=g2_PhotoItem.g_id";
$stmt .= " AND g2_Entity.g_id=g2_PhotoItem.g_id";
$starttime = microtime(true);
if ($separate_prep_and_exec)
{
$result = run_statement($stmt, $parameterArray);
}
else
{
$result = run_statement($stmt);
}
$endtime = microtime(true);
$elapsedtime = $endtime - $starttime;
printf ("${br}Elapsed time = %lf.${br}", $elapsedtime);
// Disconnect from the database
db2_close($dbconn);
}
exit;
function run_statement($statement, $parameterArray=null)
{
global $separate_prep_and_exec, $br, $dbconn;
printf("${br}Processing statement:${br}\"%s\"${br}", $statement);
if ($separate_prep_and_exec)
{
$result = db2_prepare($dbconn, $statement);
error_check("db2_prepare");
db2_execute($result, $parameterArray);
error_check("db2_execute");
}
else
{
$result = db2_exec($dbconn, $statement);
error_check("db2_exec");
}
return ($result);
}
function error_check ($action)
{
global $api, $br;
if (db2_stmt_error())
{
printf("${br}%s.${br}", db2_stmt_errormsg());
}
else
{
printf("${br}$action successful.${br}");
}
}
?>
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Wed, 2006-09-27 21:39
OK, I ran your script with both true and false.
Here's "select total_cost from explain_statement" after running it several times in both modes:
db2 => select total_cost from explain_statement;
Larry Menard
Joined: 2005-10-01
Posts: 757
All right, that's excellent! Now we're cookin' with gas!
We've confirmed that the Optimizer is selecting the same Access Path for the statement in both
cases. (The value "TOTAL_COST" represents basically the amount of processing it will take DB2
to process the statement, expressed in units called "timerons".)
And we've also confirmed that, in PHP, that statement takes much longer to execute via
db2_exec() than it does via db2_prepare() and db2_execute().
Now let's see if we can simplify our repro scenario. You can disable the EXPLAIN stuff, but now
let's simplify that SQL statement in stages, and see what happens to the time. For example,
let's reduce it to only 1 value in the IN clause. Does the 'false' time decrease?
(BTW, the EXPLAIN stuff contributes some overhead, so get a few runs of the statement AS-IS but
with EXPLAIN disabled, and then make sure you compare the "1 value" times to the "non-EXPLAIN
40-value" times.)
If the time decreases, then we know it's related to the number of values in the IN clause.
If the time does not decrease, I'd suggest making the statement as simple as possible (say,
"select g_id from g2_item") and see if the disparity still exists.
If the time disparity does go away with the simple statement, then I suggest taking the original
statement and repeatedly making it slightly more simple and observing the times. Keep going
until the disparity does go away. When the disparity does go away, the problem is related to
the last thing you removed from the statement.
(Note that simpler statements will obviously take less time than more complex ones, so we'll
need to be able to recognize 'normal' time changes from 'abnormal' time changes. For a baseline value, use the db2_prep/db2_execute time for each statement.)
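A rough harness for that kind of sweep might look like this; a sketch only, where the database name, credentials, table and the id range are placeholders:

<?php
// Sketch: compare db2_exec() against db2_prepare()/db2_execute() as the
// number of values in the IN clause grows. Database name, credentials and
// the id range are placeholders.
$conn = db2_connect('gallery2', 'username', 'password');
if (!$conn) { die(db2_conn_errormsg()); }

for ($n = 1; $n <= 40; $n++) {
    $ids     = range(4592, 4592 + 3 * ($n - 1), 3);       // $n literal values
    $markers = implode(',', array_fill(0, $n, '?'));      // $n parameter markers

    // db2_exec() with the values hard-coded in the IN clause
    $sql   = 'SELECT g_id FROM g2_Item WHERE g_id IN (' . implode(',', $ids) . ')';
    $start = microtime(true);
    db2_exec($conn, $sql);
    $execTime = microtime(true) - $start;

    // db2_prepare()/db2_execute() with parameter markers
    $sql   = 'SELECT g_id FROM g2_Item WHERE g_id IN (' . $markers . ')';
    $start = microtime(true);
    $stmt  = db2_prepare($conn, $sql);
    db2_execute($stmt, $ids);
    $prepTime = microtime(true) - $start;

    printf("%2d values: db2_exec %.6f s, prepare/execute %.6f s\n",
           $n, $execTime, $prepTime);
}
db2_close($conn);
?>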
Try that, and let me know what happens.
Also, it's still a good idea to do the CLI test too. It'll tell us if the problem is in CLI
or 'ibm_db2'. And if it is in 'ibm_db2', then we should be ready to open a detailed bug record
against 'ibm_db2'.
Hey, look up ahead... could that be light at the end of the tunnel?
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Wed, 2006-09-27 23:08
Ok, really weird.
As soon as I disable the EXPLAIN part, the execution times match.
Re-enable EXPLAIN for both, and the db2_exec time jumps to 2 seconds. I ran each many times to ensure I'm getting a reliable result.
If I knock the IN down to one value, times still match with EXPLAIN disabled; the db2_exec time jumps back to 2 seconds with EXPLAIN enabled, while prep/exec only jumps to 0.18 seconds, on average.
So, well, what the heck does that mean? I deleted my test script earlier; I'm trying to dig it out from backup now. It showed the 2 second time difference, but I never had EXPLAIN enabled before.
I'm trying the CLI test now.....
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Wed, 2006-09-27 23:40
Ah, I just found out that the SET CURRENT EXPLAIN MODE causes only Dynamic SQL to be EXPLAINed.
"db2_exec()" is Dynamic SQL, but "db2_prepare()/db2_execute()" is actually Static SQL, so it is
not being EXPLAINed.
Sigh... I don't see a way to get EXPLAIN info for Static SQL in a PHP application. So, let's
forget about the EXPLAIN and Access Plan and all that stuff... it's highly unlikely that the
Optimizer is selecting a different Access Plan for the same statement. Let's just assume that
it isn't.
If you can get back your other script that does show the 2 second difference, then we can try
simplifying the statement in it, as described above. Just try to find one attribute of the
statement that (significantly) affects the time taken by db2_exec().
I'm getting dizzy...
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Thu, 2006-09-28 13:42
Ummm... I'm losing face here.
I got my script.
Times match. Either I'm going crazy, or I wasn't paying attention to what I was doing before, or FP13 fixed that problem. In fact, what I see now is: up until 6 items in the IN clause, prepare/exec matches exec almost exactly every time. Add a seventh item and prepare/exec goes out the window (jumps up to 1.5+ seconds), while db2_exec remains fast now (0.009).
OK, enough of that. Back to my SQL trap; I need to see what's taking the time again.
Pursuant to what I see in my OS monitor, httpd is at the top of the 'top' list now, not db2, but db2sysc is right there under it.
Looking at a new SQL trap with timings (entering/leaving _query), it appears to be busy the whole time. All queries are executing in 0.1 seconds or less; 99% are 0.00x. So db2 seems to be performing well, which is actually what I saw in one of my timing traps way back when. (But the few saying 2.1 threw me off; maybe FP13 fixed the IN thing, maybe I forgot to activate the database because I took it offline for something... I don't know anymore.)
Man, I'm losing it. OK, during the stall, my trap is busy spitting info the whole time, httpd is at the top, no query takes longer than 0.01 seconds to execute, and there are a couple of 1 to 1.3 second gaps from when it leaves _query to when it comes back to it, so that is consistent with my top report that httpd (php) is busy.
So, I'm going back to my noted points in the code where the stall was taking place... I don't think it's really DB2 anymore.
Here's a sample _query trap/timing: http://www.snyderworld.org/ftp/g2_querys
Larry Menard
Joined: 2005-10-01
Posts: 757
Well, I'm sure the earth under your feet changed when you upgraded all your software. Probably
a good idea to go back to square 1, and not rely on anything we saw earlier.
OK, so we know that the time is being spent outside of _query(). I think it follows then that
that also means the time is not being spent in anything beyond _query() in the call stack.
Specifically, I'm thinking about ibm_db2 and DB2 itself... if they were slow, that time would
show up in the _query() times.
So the problem now seems to be somewhere in either ADOdb (before _query()), Gallery, PHP, Apache, or AIX.
I'd suggest we try to eliminate them in that order:
1. Remove all existing timings, put some timings in the ADOdb code at the precise point where it takes over from the Gallery code and when it returns to the Gallery code, and re-run the test. Look at the proportion of time spent in ADOdb (and beyond) versus the time NOT spent in ADOdb (and beyond).
2. Remove all existing timings, put some timings in Gallery at the precise point where it starts and the exact point where it returns. Look at the proportion of time spent in Gallery (and beyond) versus the time NOT spent in Gallery (and beyond).
Or instead of trying to time one layer at a time, you could just leave all your timings in the
code and do the analysis of all layers at once.
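One low-tech way to drop those timings in; a sketch only, where the function name and log path are made up, and exactly where you call it in ADOdb/Gallery is up to you:

<?php
// Minimal timing helper: call g2_mark('some label') at each point of interest
// and diff the timestamps in the log afterwards. The log path is a placeholder.
function g2_mark($label) {
    static $last = null;
    $now   = microtime(true);
    $delta = ($last === null) ? 0.0 : $now - $last;
    $last  = $now;
    error_log(sprintf("%.6f (+%.6f) %s\n", $now, $delta, $label),
              3, '/tmp/g2_timings.log');
}

// Example: in ADOdb, just before and after it hands off to the ibm_db2 driver:
//   g2_mark('enter ADOdb Execute');
//   ...existing ADOdb code...
//   g2_mark('leave ADOdb Execute');
?>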
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Thu, 2006-09-28 15:11
I'm going step by step back...
I did get the devel version of g2 going. Is there anything useful in there, timing tests or anything else I can use?
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Thu, 2006-09-28 15:24
> anything useful in [devel version of g2], timing tests, or anything I can use there?
Not that I know of, but I don't know for certain. If Valiant is still hanging around, perhaps
he knows?
valiant
Joined: 2003-01-04
Posts: 32509
Posted: Thu, 2006-09-28 16:27
There's just some slow query profiling in G2. See the part about $gallery->setProfile(array('sql')); in config.php.
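For reference, that is just one line added to config.php (a sketch; see the devel docs for exact placement):

// In Gallery 2's config.php (already a PHP file), add:
$gallery->setProfile(array('sql'));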
In fact, I find what you write alarming:
Quote:
In fact, what I see now is: up until 6 items in the IN clause, prepare/exec matches exec almost exactly every time. Add a seventh item and prepare/exec goes out the window (jumps up to 1.5+ seconds), while db2_exec remains fast now (0.009).
So the IN () clause issue exists. If the two approaches diverge once you go beyond 6 elements in the IN predicate, then this is a problem the DB2 guys should be aware of.
Or does it have to do with the total query size (the number of characters in the SQL query string)? Maybe the query buffer is too small?
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Thu, 2006-09-28 16:41
> IN predicate performance
I agree it is alarming, but G2 does not do separate prepare/execute... it does only db2_exec().
So yes it is an issue that I'd like to follow up on eventually, but more important right now
are things that affect G2.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Thu, 2006-09-28 17:57
Ummmmmmmmmmmmmmm
You guys are gonna hate me.
I just figured it out.
Are you ready....
Are you sure.....
This is really, really, really stupid......
I feel like an XXX for wasting all of your time like this........
Not to mention countless hours of my own time...
But, hey, lets look on the bright side...
There IS a problem with the DB2 IN clause. I'll follow up on where that's happening (CLI level or ibm_db2 level) and try to get that taken care of.
But...this performance problem....ugh.......
So, remember I said that in my OS monitor, my hard disk was doing pretty much nothing.
Well, there's a little note there saying that if you want better statistics, run this command (basically it just opens up a kernel extension to read more detail).
So my disk report was showing ---- (more dashes means more activity).
Well, I ran that command, and now I see RRWWWWRRWWWWW instead of dashes.
I always saw a few dashes there, thinking that's just the actual retrieval of the picture. Now I'm looking at it going, why do I see WWWW? I shouldn't see writes...
So I poke around a bit to see what's going on.
I had my apache user's umask set to 200 (-w-------). Don't ask me how that happened; that's another story. In my data dir, the albums directory had a sticky bit of 0200 (so it could read its own files, even though it only had write permission). The cache and smarty directories weren't so fortunate, so every time a page was requested, even though all the cache and fast-inc files were there, it kept having to re-make them. I never saw the convert command in my top list, so I don't know about the thumbs themselves, or how it was getting those without re-converting them every time. Anyhow, I changed the umask to something more proper and now, finally, my problems are solved. No more stalling. Well, at least the second time you hit a page. I hit next, the page loads and my thumbs come in; a 40-thumb page takes about 1.2 seconds to completely load. It was 1-2 seconds before, with APC turned on, just to begin loading the page.
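A quick way to double-check that kind of thing from the web server's point of view (a sketch only; the g2data paths are placeholders):

<?php
// Run this as the web-server user: report whether the G2 storage directories
// are readable and writable. Paths are placeholders for your g2data location.
$dirs = array('/path/to/g2data',
              '/path/to/g2data/cache',
              '/path/to/g2data/smarty');
foreach ($dirs as $dir) {
    printf("%-30s readable=%s writable=%s\n",
           $dir,
           is_readable($dir) ? 'yes' : 'NO',
           is_writable($dir) ? 'yes' : 'NO');
}
?>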
So, well, at least we know what the problem was now.
I'm terribly sorry for taking up your time on this problem. It was driving me nuts though, I knew it couldn't be that slow.
I do very much appreciate your help. If I can help in any way to advance the G2 project, I'd be happy to offer my services. (That is, if you trust my fat fingers.)
And I'm almost done with the initial phase of porting the Serendipity blog software to DB2, another positive that came from all this. One last hurdle there: LIMIT/OFFSET, which they use A LOT, and DB2 has no LIMIT/OFFSET clause. (I know, it's not standard SQL, that's why it's not there; those other DBs are cheating.)
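(For what it's worth, the usual DB2 workarounds are FETCH FIRST n ROWS ONLY for a plain LIMIT, and a ROW_NUMBER() OVER() wrapper when an OFFSET is involved; a sketch, with table and column names as placeholders:)

<?php
// Emulating MySQL-style LIMIT/OFFSET on DB2. Table and column names below
// ("entries", "entry_date") are placeholders, not Serendipity's real schema.

// LIMIT 10
$sql = "SELECT id, title FROM entries ORDER BY entry_date DESC
        FETCH FIRST 10 ROWS ONLY";

// LIMIT 10 OFFSET 20 (i.e. rows 21 through 30)
$sql = "SELECT id, title FROM (
            SELECT id, title,
                   ROW_NUMBER() OVER (ORDER BY entry_date DESC) AS rn
            FROM entries
        ) AS t
        WHERE rn BETWEEN 21 AND 30";
?>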
Thanks again. I think this one is licked, and it was all my fault to begin with.
Whew, 60+ posts on that....brutal..
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Thu, 2006-09-28 18:19
Ah-haa... No problem. Like you said there are positives that came out of it. Another positive
is... it looks like you've now upgraded your entire system.
Can you still repro that IN predicate problem? If so, could you attach or post your script
here? Thanks.
73blazer
Joined: 2006-05-12
Posts: 79
It's at 6 right now. Add one more value to it and execution time jumps considerably for the prep/exec method.
I'm trying it in CLI now...
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Thu, 2006-09-28 23:10
Confirmed:
A simple compiled CLI application exhibits the same performance problem.
6 items in the IN clause and it speeds right through; 7 items, time for a sip of tea.
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Thu, 2006-09-28 23:25
OK, that's still fascinating because I can't repro the difference in PHP. Would you mind
sharing a copy of the simple CLI program?
You don't happen to have any funky "PATCHn=...." settings in your "cli.ini" file, do you?
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Fri, 2006-09-29 00:01
Yeah... I'll post it.
I may have placed the time function in not such a good place though... I'm getting inconsistent results. The PHP program shows it all the time.
Give me a few minutes with the CLI thing here while I carefully place the time functions.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Fri, 2006-09-29 00:50
OK, here's the output from my CLI program.
To be fair, the timing for prep/exec starts prior to SQLPrepare and ends after SQLExecute. But with 6 parameters, prep/exec is faster.
Oh, and oops, "microseconds" should be "seconds"...
Connecting to wwwdb...
Connected to wwwdb.
TO PERFORM A BASIC SELECT USING SQLGetData6:
Directly execute the statement
Data retrieved.
QueryTime= 0.005996 microseconds.
TO PERFORM A BASIC SELECT USING SQLGetData7:
Directly execute the statement
Data retrieved.
QueryTime= 0.003580 microseconds.
====================================
====================================
====================================
====================================
====================================
TO PERFORM A SELECT PREP/EXEC WITH 6 PARAMETERS
Prepare the statement
Binding params to the statement
Data retrieved.
QueryTime= 0.003801 microseconds.
TO PERFORM A SELECT PREP/EXEC WITH 7 PARAMETERS:
Prepare the statement
Binding params to the statement
Data retrieved.
QueryTime= 0.187981 microseconds.
Oh yeah.
I'm smokin' now.
Turned back on APC, turned on partial acceleration...
I can't click fast enough....
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Fri, 2006-09-29 09:22
Yeah, but the real question is... what are you smokin'?
Sorry, I just couldn't resist.
BTW, I can't compile your CLI program... IIRC, the time functions are different on Windows (or
at least the header file for time functions is), but it shouldn't be tough for me to fix once I
get the time to do it.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Fri, 2006-09-29 12:19
CPU Clock cycles....that's what I'm smokin!
I tried to pick one that was portable. I thought struct timeval and
gettimeofday were good for Windows.
Try to put
#include "win32/time.h" instead
I took them out of php's microtime function for just that reason. They should be good.
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Fri, 2006-09-29 20:11
Hmm, nope, doesn't help. I'll have to hit MSDN to figure out how to use time functions.
This is about the 3rd or 4th time I've had to go to MSDN to figure this out (I have to do this about once every 5 or 6 years). Maybe I should save a little sample program locally.
Larry Menard
Joined: 2005-10-01
Posts: 757
Posted: Fri, 2006-09-29 20:57
Ah, the header file for time functions is "WinSock2.h". That's not bloody obvious, is it?
I still can't compile it though... now I get a lot of errors about the SQLINTEGER type. I
think the root error is:
Quote:
tbread.c(228) : error C2275: 'SQLINTEGER' : illegal use of this type as an expression
c:\Program Files\IBM\SQLLIB\include\sqlcli.h(364) : see declaration of 'SQLINTEGER'
But when I take that structure definition and paste it into the existing tbread sample, it
compiles fine. So there must be something else in your tbread that is breaking it?
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Fri, 2006-09-29 23:23
Um, I'm not sure. I've never compiled anything on Windows.
I took the definition from an example in the samples/cli directory. Poke around in there and see if they define integer variable arrays as some other type? But I think those files should be the same as mine; they don't put different C code in for different platforms, do they?
Not much is coming up on a search of that error message.
73blazer
Joined: 2006-05-12
Posts: 79
Posted: Fri, 2006-09-29 23:41
I think the SQLBindParameter is having trouble with it. I had to play with the bind line a few times to get it right.
Try to change the array definition line from SQLINTEGER to SQL_C_NUMERIC
Posts: 79
I took the latest, 1.5.1, and performance got worse. Even serendipity is slower now. Measureably so.
I'll try 1.4.10.
Posts: 79
Well, I've been playing around.
I've compiled various versions of apache, with various config options.
Various versions of php with various config options.
Nothing seems to get rid of the initial stall. Although, httpd is now at the top of the list, instead of db2sysc.
I've tried APC, the Zend OPtimizer, and the Zend Core for IBM, AIX Compiler and gnu compiler. GNU compiler produced horribly slow setup. APC beats Zend OPtimizer hands down every time. I know they do different things (APC is caching, Zend Opt. optimizes code more effciently), but they don't play nice together, so you can only have 1.
This machine has been beat the last 3 days doing almost nothing but compiling.
Zend core was ok, it had a nice web interface for managing things, but it's not optimized for my machine. I can specifcy to AIX compiler some options specifically for my processors, number of processors, etc. Benchingmarkig with and without just these options is about a 3-8% improvment, depending on what I was doing.
The fastest setup I've been able to produce is:
php5.1.6, apache2.2.3,ibm_db21.5.1 (the first 1.5.1 I compiled earlier in the week, it picked up gnu compiler, that's why it was noticabbly slower)
Oddly enough, it's the latest levels of everything. php is a DSO
php4.4.4 and apache.1.3.37,ibm_db21.5.1 was a very close second. Almost negilible. Difference of 0.3 seconds. But always the 0.3 difference. This was php compiled statically into apache instead of being a DSO.
Oh, and this time around I figured out how to make DSO modules for php easily (where you just put the .so somewhere and declare it in php.ini). So ibm_db2 is now a DSO, albeit with no gain or loss in performance.
I also upgrade My os from AIX 5.3ml03 to ml05+.
DB2 is still FP11. I'm going to try 13, and to re-create the database entirely at FP13 level and just import g2 data into it.
I'm down to a 2-3 stall for most operations in g2, and httpd is at the top now, not db2sysc, but db2sysc is right there under it at 14% CPU.
BTW, don't try apache with mpm=worker configure option, seg faults on anything with ibm_db2 driver. Had to use prefork. (which is always what I had, but some reading I did, one guy said worker was faster for AIX)
Anyhow, I'm wiped out from all this building and reading (I was up till 5am last night, I really need something better to do. Actually, I have better things to do, these types of problems just eat at me though, I lay in bead just thinking of another possible solution, and of course I have to try it).
I can still produce the stall in a short test script with an IN clause (and I suspect other types of clauses do the same thing, I re-ordered all the g2 IN clauses with x=2 or x=3 or x=5 format, still a stall), under any build setup, so I'm pretty sure it's not me or how I built it at this point. DOing some reading, I found some things that say that ibm_db2 driver is optimized for windows, UNIX being an afterthought, so mabey it's just that way for us unix guys, or mabey it's just AIX. But that was just one guys blog, so don't know how much truth there is to that. I'll be posting up on the ibm_db2 mailing list and see what they have to say about it.
Posts: 79
So, umm, I see this config option for persistent connections and I set it to true.
That cuts it in half. I'm at 1-2 seconds now....for 40 thumbs and mod_rewrite module and image block modules for most albums is on again.
What's the impact here, other than I see 8 applications owned by g2 always connected with db2 list applications? This is fine from my perspective.
I'm almost happy. I think that's about the best I'm gonna do.
Posts: 32509
What happened to the connects. Still doing 40 connects per page with rewrite + mod_rewrite (not pathinfo)?
Since it's using persistent connections now, there shouldn't be a large performance penalty involved with that, but with x parallel users, you will run into connection limits pretty quickly.
And what about the IN () performance issue? Is that something the DB2 team should follow up with?
Posts: 757
73Blazer, thanks for spending so much time debugging this. I'm a little obsessive about
problems too.
I don't buy the "mabey it's just that way for us unix guys, or mabey it's just AIX". I can
almost guarantee you that it was developed and extensively tested on Linux. And there have
been Linux people posting here saying that their performance is fine. One person even emulated
200 users, IIRC.
> Repro script
If I understand correctly, you can repro the hang only when using db2_prepare() and
db2_execute(). I'm not convinced that this is the same problem as you are having in Gallery...
Gallery does not use db2_execute()... it uses db2_exec() exclusively, as far as I can see.
> Multiple connects
What's the latest here? Were the 40 connects due to using path_info in your rewrite module or
not? I'd prefer to debug the Gallery code so that we find out exactly why this is
happening, rather than try to deduce possible reasons. This is supported by the fact that
things improve to almost-normal when you use persistent connections. And it tells me that the
IN() clause is not the problem. A separate problem perhaps, but not one that I think you are
hitting in Gallery. Think about it... the timestamps you put in your Gallery code did not
show a slowdown in any statement execution. I think your main problem is the number
of connects.
Unless I'm forgetting something?
Posts: 79
It's the db2_exec that is the problem. 'IN' clause problem goes away with db2_prepare/db2_exectute. I initially posted the other way then corrected myself.
So how does one emulate 200 connected users?
Yeah, I know the problem is still there, I've just masked it with caching, pconnect and more effiecient apache/php, but I know, the stall is still there, just not as in your face as it was.
I still see 42 calls to connect (pconnect now) when mod_rewrite module is active. If I deactivate it goes to 2, decativate image block and it goes to 1. Mostly what's improved is the loading of the actual images, already past the initial stall. So with regular connect, I get the stall, page starts to load, images show up one-by-one (keep in mind I'm on the same network here for testing) and the page takes another 2-4 seconds to finish loading, but at least you have something to look at. With pconnect, the page finishes loading in 0.5 seconds. But still the initial stall, persists.
Yeah, I would like to know *exactly* what's causing as well.
So, one thing I noticed, you say to create the database with UTF-8 codeset and territory US, how important is that? I never specified those, and DB2 defaults to ISO-8859-1 codeset.
Posts: 757
> how does one emulate 200 connected users?
See Sphericus' posting: http://gallery.menalto.com/node/37648?page=17#comment-186478
I gather his server is on Linux but his client was on Windows.
> UTF-8 codeset and territory US
To be honest, I forget, but I think it was to support unicode characters in the text fields.
I'm not sure if this could cause more connects; I suspect not, but I'm not sure. Might be
worth a quick test, at least so that we can exclude it as a suspect.
> pause between displaying each photo
The reason you're seeing pauses between each photo when using regular (non-persistent)
connections is probably because there seems to be a connect/disconnect happening between each
image. The symptom seems to fit.
And I doubt it's related to the IN predicate because you previously said that you get the same
response degradation even when using much smaller pages (which would have a much smaller IN
clause). That is still interesting though, and I'll try to follow up on that if I can repro it.
So again, my suggestion is to find out why Gallery (and specifically the URL Rewrite module, it
appears) is causing those multiple connects.
My URL Rewrite module is configured to use PHP's "pathinfo" (not Apache's "mod_rewrite"), with
the only active components being "Permalinks" and "URL Rewrite / Show Item". Any chance you
could switch to using PHP's "pathinfo" instead of Apache's "mod_rewrite" to see what happens?
Posts: 757
When I enable "Download Item" my URL Rewrite module I also see a severe degradation, in which
I can see a definite pause between each image being loaded. Doing frequent "db2 list
applications" commands on the server while this is happening, I see many database connections
coming and going. Very similar to your findings.
When I disable "Download Item", the degradation and the multiple DB connections go away.
Do you have "Download Item" configured in your URL Rewrite module?
Posts: 32509
That's pretty much why we warn to use the DownloadItem rule with pathinfo.
The warning is shown when you install the rewrite module and click configure. It's on the page where you choose between pathinfo and mod_rewrite.
Posts: 757
That's exactly why I decided to try it... I'm wondering if that's what 73blazer is doing.
Posts: 79
Yeah it is what I'm doing . But that only slows the performance after the page is starting to be loaded. Even if I shut off the rewrite module completely in Site Admin, the initial page stall is still very present.
Persistant connections makes the Download Item option degradation, go away.
I used the download item option because the URL for the actual re-sized picture (which I use alot in forums) is alot cleaner. I can turn it off, but then all my posts since may in various forums won't show a picture,so I'm a little bit screwed there.
How do I switch to use PHP pathinfo? Do I have to take apache mod_rewrite module out in httpd.conf?
I used the MIcrosfot tool to run a 200connection stress test. No real problems, no timeouts, report shows the times for requests jump a few seconds when you use 200 users as opposed to 20 users, but that would be expected. I need the devel version of g2 to get lib/tools, correct? Can I just put that directory, or should I replace my whole g2 install with the devel version?
Posts: 757
Ah, OK... I didn't realize that you were distinguishing an "initial stall" from a "between images" stall.
In that case, perhaps the "initial stall" is the IN(...) predicate problem you're having?
> Microsoft Stress Tool
So running the Microsoft stress tool you didn't see any stalling??? OK, I'm stunned at that.
Could it be a browser issue then?
> How do I switch to use PHP pathinfo?
I think you might need to un-install and re-install the URL Rewrite module. I can't find any
way to change that setting post-install.
> version of G2 needed for "lib/tools"
I'm not sure of that, but I wouldn't mix-&-match... I suggest you grab a whole development
build.
Posts: 79
Oh, there's stalling. I just said 200 users wasn't much worse than 20 (about 0.241 seconds longer per request on average worse, not too bad actually). But in all cases, inital page load takes, too long. Avg 0.98 seconds, to start of firt byte received, which is, my stall. If I turn off APC, first byte recieved comes in 3.72 seconds. If I put my IN clauses back, 6.712 seconds.
Can I have two gallery2 installs pointing at same DB? That should work I would think.
Posts: 32509
Nope, there are still x queries per download request that would not exist without.
Just don't use pathinfo!
Use mod_rewrite instead.
disable and uninstall the g2's rewrite module. then install it again and when configuring, choose mod_rewrite instead of pathinfo.
Posts: 757
73blazer:
OK, thanks for clarifying that.
Yes you should be able to have two Gallery installs pointing to the same database. Just get
another copy of Gallery, & copy the "config.php" from your original directory into it. I do
things like that frequently on my test system.
Valiant:
Not sure I understand that. I think he already uses mod_rewrite. I use pathinfo and I don't see the problem unless I enable
"Download Item".
Posts: 79
True, queries are still passed, and they return quite quickly (0.007212 seconds) but at least I don't have to wait for a connect with persistent connections, which was causing a slowdown when I had connect instead of pconnect.
Anyhow, I did use pathinfo when I installed that module. I was wondering what you guys were talking about with this pathinfo stuff; I remember why now.
Trying to switch now, I can't get the mod_rewrite method to pass the setup check.
I use mod_rewrite in other applications, and can make my own quick test, so it's definitely loaded and working in apache.
I'm looking at another thread here where people have said they have this problem with g2.
Posts: 79
I just upgraded to DB2 FP13. A bug was fixed for another application (non-php), but performance benchmarking against various db2/php apps shows little to no gain.
Oh well, I'm on the bleeding edge of everything now.
Haven't looked at the mod_rewrite thing yet; that has to be something stupid. But I don't think it's my problem, because even after deactivating and de-installing it, I still have the stall on page load.
So, what else does the devel version do that could be useful here for trying to find where/what is taking the time?
Posts: 757
> stall on page load
Well, now that you have identified that one query that runs slower when run via db2_exec() than
it does with separate db2_prepare() and db2_execute(), let's look at it from that angle.
The first thing I'd compare is the Access Plan the DB2 Optimizer selects in the two
scenarios. First of all, it should not be different between the two. Secondly, the DB2
Optimizer should be using the most efficient Access Plan. I'm not a good judge of whether or
not that is true, but I've seen you posting to DB2 newsgroups, so you could ask there for
confirmation.
Another thought: running the same statements from a small C test program
could confirm or eliminate PHP and ibm_db2 as suspects. I'm not sure it will tell
us anything that we don't already know, but it might be worth a try.
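A rough sketch of that db2_exec() vs. db2_prepare()/db2_execute() comparison; it assumes an existing ibm_db2 connection in $conn, and the table, column, and IN values are just stand-ins for the real Gallery query:

<?php
// Sketch: $conn is an existing ibm_db2 connection; the statement is a stand-in
// for the real Gallery query with the IN clause.
$sql = 'SELECT g_id FROM g2_item WHERE g_id IN (2, 3, 5, 7, 11, 13, 17)';

$t0  = microtime(true);
$res = db2_exec($conn, $sql);            // one-shot: prepare + execute together
while (db2_fetch_array($res)) {}
$execTime = microtime(true) - $t0;

$t0   = microtime(true);
$stmt = db2_prepare($conn, $sql);        // separate prepare...
db2_execute($stmt);                      // ...then execute
while (db2_fetch_array($stmt)) {}
$prepTime = microtime(true) - $t0;

printf("db2_exec: %.4fs   prepare/execute: %.4fs\n", $execTime, $prepTime);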
Posts: 79
I've read the contrary: prepare/execute supposedly gives better performance. I can't find it now, but it was an article on the ibm.com website, and it said so outright. Right underneath that, though, it says db2_execute is just a wrapper around db2_exec. So, I don't know.
Good call with the C thing.
I took all the statements that I captured in my SQL trap while going from one page to the next and ran them from C, since I have a nice little C/DB2 app I'm developing; the whole framework was already there.
The whole lot of statements ran in 0.113095270012 seconds. Of course I wasn't parsing the results, just throwing them away, but the execution of the statements was what I was interested in.
Posts: 757
You may have read something about whether or not Static SQL is faster than Dynamic SQL;
that is pretty hotly debated, but it's more a result of caching. I'm talking only about
the selection of an access plan by the DB2 Optimizer. It should choose the same access plan
for a given statement regardless of whether it is processed as static or dynamic, and that's what I want to confirm is true in your case.
Is your C program a CLI application that uses SQLExecDirect()? I believe SQLExecDirect() is
the equivalent of db2_exec()... just like SQLPrepare() and SQLExecute() are the equivalent of
db2_prepare() and db2_execute().
If so, then I'd say that's a fair comparison. And if the access plan selected by
the DB2 Optimizer for that statement in the two PHP scenarios is the same, then I would
strongly suspect that the problem is somewhere in either PHP or ibm_db2. And in that case
I think we'd need to instrument (put timestamps into) the ibm_db2 driver to see where it's
spending its time in both scenarios.
Side note: In the CLI traces you collected, I see SQLPrepare() and SQLExecute() being used,
not SQLExecDirect(). But I think that might be because SQLExecDirect() calls SQLPrepare() and
SQLExecute() under the covers. After all, every statement needs to be Prepared at some
point before it can be Executed... it's just a question of when the Prepare is done.
Plus, if you have the time, I would like to know if using a database created with the UTF-8
codeset and territory makes any difference. We may as well compare apples to apples.
This is just too weird. God how I hate performance problems. Give me a nice solid functional
problem anytime, but I hate performance problems.
Posts: 757
I'm attaching a php script you can use to get the EXPLAIN info.
The total cost of the two statement executions should be identical.
You can comment the EXPLAIN stuff back out and run the script a number of times to get average
execution times. My times are pretty consistent regardless of whether I use separate prep and exec
or not.
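Not the actual attached script, but a minimal sketch of that kind of EXPLAIN run; it assumes the EXPLAIN tables already exist (they can be created with sqllib/misc/EXPLAIN.DDL) and an ibm_db2 connection in $conn:

<?php
// Sketch: assumes the EXPLAIN tables exist and $conn is an ibm_db2 connection.
// Flip $usePrepare to compare the two code paths.
$usePrepare = true;

db2_exec($conn, 'SET CURRENT EXPLAIN MODE YES');   // explain and still execute

$sql = 'SELECT g_id FROM g2_item WHERE g_id IN (2, 3, 5, 7, 11, 13, 17)';

$t0 = microtime(true);
if ($usePrepare) {
    $stmt = db2_prepare($conn, $sql);
    db2_execute($stmt);
} else {
    $stmt = db2_exec($conn, $sql);
}
while (db2_fetch_array($stmt)) {}
printf("elapsed: %.4fs\n", microtime(true) - $t0);

db2_exec($conn, 'SET CURRENT EXPLAIN MODE NO');

// Afterwards, from the CLP:  select total_cost from explain_statement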
Posts: 79
Ok, I'm gonna admit something I probably shouldn't; I hope my customers don't see this.
Truthfully, I don't know what my C++ application is doing. I started with some examples out of the db2_08_01/samples/cpp directory in an attempt to solve a problem for a certain customer, and found its performance and ease of use appealing, so I use it.
Why don't I know? I'll spare you the explanation here in this forum; it has something to do with the fact that, well, I'm not well versed in certain things (read: I'm a hack).
So, anyway, sorry.
It's ".sqC" code that does queries in the format of:
or (the prepare method, which is what I use/used for the g2 trapped SQL test)
Is that CLI, or something else? I don't really know (yes, I know, sad). I don't see SQLExecute anywhere in the APIs/header files I'm using.
I'm in the midst of re-making my database under FP13 (as opposed to migrating from FP4->FP11->FP13, which is where it is now; there shouldn't be any difference, I know, but the old-school part of me says migrations are usually inferior to fresh installs, so I eliminate the possibility at the very least), and I'll just export/import the tables into it. I'll be sure to specify TERRITORY and CODESET this time.
Thanks for the PHP script for the explain stuff! Umm... it's 0 length?
Posts: 757
No prob, here's the Reader's Digest version of the C application development options.
".sqC" files are source files that must be 'precompiled' by DB2 before they can be compiled.
This 'precompilation' does two things:
- It pulls the SQL out of the "EXEC SQL ..." macros, replaces it with calls into the DB2 runtime, and binds those statements to the database.
- At that time the Optimizer checks the Statistics and decides what type of Access Plan to use for each statement. This Access Plan information is stored in the database, waiting for the statement to be executed.
The result of 'precompilation' is a ".C" file, which can then be compiled, linked with DB2
libraries, and executed.
The second bullet (the selection of the Access Plan) means that this is "Embedded Static SQL",
because DB2 already knows how it will handle the statement even before the statement is
actually executed.
The alternative is "Dynamic SQL". In DB2, this means the "Call Level Interface", or CLI.
In CLI, applications are coded in pure C code (no "EXEC SQL ..." macros) in ".C" files, so the
application source code is never 'precompiled', so DB2 will not know what type of Access Plan
to use for the SQL in that application until the application explicitly 'prepares' the
statement. Instead of the "EXEC SQL ..." macros, CLI uses the APIs I mentioned above...
"SQLPrepare()" and "SQLExecute()", or "SQLExecDirect()". And since the source file is never
'precompiled', it is compiled and linked with DB2's CLI library immediately. CLI samples are
in the "sqllib/samples/cli" directory.
(Note that it is possible (but not commonly done) to combine Static and Dynamic SQL in the same
".sqC" source file. It is basically a ".sqC" file that has to be 'precompiled' and then
linked with both the Static SQL and CLI libraries.)
So, what you've done with your ".sqC" application is Embedded Static SQL. What I'd like to test
is Dynamic SQL (CLI) using SQLExecDirect() to run the statement.
Since 'ibm_db2' is a CLI application, it is basically a 'layer' that sits on top of CLI. I
want to bypass that 'ibm_db2' layer and give the statement directly to CLI, to make sure it
performs well there. If it does, then the problem is likely in the 'ibm_db2' layer.
So what I would recommend is looking at the CLI sample program "sqllib/samples/cli/tbread.c"
and replacing the guts of it with that one SQL statement with the IN clause. (The sample
unfortunately requires certain support files that are in the same directory... various utility
routines. You can probably just remove that junk from the source file.) To compile & link the
source file, see the "bldapp" script in that same directory.
Hope that's helpful.
Posts: 79
I created a new database with UTF-8 CODESET and TERRITORY US. No change in performance; the numbers match exactly.
Posts: 79
I've got the tbread.c example. I'll pass my statements through there and see what it does.
Yeah, that does help, thanks! I notice when I compile my application it makes a .C file for every .sqC I have. Now I know why.
That php file you attached in the previous note is 0 length; do you still want me to try that?
Posts: 757
> UTF-8
OK, at least we can eliminate that. Thanks.
> 0-byte
Oh, sorry I misunderstood... I thought you were saying the value it returned was 0-byte.
I've been having a lot of damn problems attaching files here. Here's the script:
Posts: 79
OK, I ran your script with both true and false.
Here's "select total_cost from explain_statement" after running it several times in both modes:
db2 => select total_cost from explain_statement;
TOTAL_COST
------------------------
+0.00000000000000E+000
+2.21002990722656E+002
+0.00000000000000E+000
+2.21002990722656E+002
+0.00000000000000E+000
+2.21002990722656E+002
+0.00000000000000E+000
+2.21002990722656E+002
+0.00000000000000E+000
+2.21002990722656E+002
+0.00000000000000E+000
+2.21002990722656E+002
+0.00000000000000E+000
+2.21002990722656E+002
+0.00000000000000E+000
+2.21002990722656E+002
+0.00000000000000E+000
+2.21002990722656E+002
+0.00000000000000E+000
+2.21002990722656E+002
20 record(s) selected.
Execution time printed by your script is 0.18 seconds (true) vs. 2.1 seconds (false).
Try it yourself:
http://www.snyderworld.org/test/lm_exec.php (false)
http://www.snyderworld.org/test/lm_prepexec.php (true)
Posts: 757
All right, that's excellent! Now we're cookin' with gas!
We've confirmed that the Optimizer is selecting the same Access Path for the statement in both
cases. (The value "TOTAL_COST" represents basically the amount of processing it will take DB2
to process the statement, expressed in units called "timerons".)
And we've also confirmed that, in PHP, that statement takes much longer to execute via
db2_exec() than it does via db2_prepare() and db2_execute().
Now let's see if we can simplify our repro scenario. You can disable the EXPLAIN stuff, but now
let's simplify that SQL statement in stages, and see what happens to the time. For example,
let's reduce it to only 1 value in the IN clause. Does the 'false' time decrease?
(BTW, the EXPLAIN stuff contributes some overhead, so get a few runs of the statement AS-IS but
with EXPLAIN disabled, and then make sure you compare the "1 value" times to the "non-EXPLAIN
40-value" times.)
If the time decreases, then we know it's related to the number of values in the IN clause.
If the time does not decrease, I'd suggest making the statement as simple as possible (say,
"select g_id from g2_item") and see if the disparity still exists.
If the time disparity does go away with the simple statement, then I suggest taking the original
statement and repeatedly making it slightly more simple and observing the times. Keep going
until the disparity does go away. When the disparity does go away, the problem is related to
the last thing you removed from the statement.
(Note that simpler statements will obviously take less time than more complex ones, so we'll
need to be able to recognize 'normal' time changes from 'abnormal' time changes. For a baseline value, use the db2_prep/db2_execute time for each statement.)
Try that, and let me know what happens.
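One way to script that staged simplification for the IN-clause case, as a sketch (same assumptions as the earlier sketches: an ibm_db2 connection in $conn, and stand-in table/column names):

<?php
// Sketch: time db2_exec() vs db2_prepare()/db2_execute() while growing the IN list.
// $conn is an existing ibm_db2 connection; the id values are arbitrary.
$ids = array(2, 3, 5, 7, 11, 13, 17, 19, 23, 29);

for ($n = 1; $n <= count($ids); $n++) {
    $sql = 'SELECT g_id FROM g2_item WHERE g_id IN ('
         . implode(', ', array_slice($ids, 0, $n)) . ')';

    $t0 = microtime(true);
    $r = db2_exec($conn, $sql);
    while (db2_fetch_array($r)) {}
    $execTime = microtime(true) - $t0;

    $t0 = microtime(true);
    $s = db2_prepare($conn, $sql);
    db2_execute($s);
    while (db2_fetch_array($s)) {}
    $prepTime = microtime(true) - $t0;

    printf("%2d values: db2_exec %.4fs  prep/exec %.4fs\n", $n, $execTime, $prepTime);
}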
Also, it's still a good idea to do the CLI test too. It'll tell us if the problem is in CLI
or 'ibm_db2'. And if it is in 'ibm_db2', then we should be ready to open a detailed bug record
against 'ibm_db2'.
Hey, look up ahead... could that be light at the end of the tunnel?
Posts: 79
Ok, really weird.
As soon as I disable the EXPLAIN part, the execution times match.
Re-enable EXPLAIN for both, and the db2_exec time jumps to 2 seconds. I ran each many times to make sure I'm getting a reliable result.
If I knock the IN down to one value, the times still match with EXPLAIN disabled; with EXPLAIN enabled, db2_exec jumps back to 2 seconds, while prep/exec only jumps to 0.18 seconds, on average.
So, well, what the heck does that mean? I deleted my test script earlier; I'm trying to dig it out from backup now. It showed the 2 second time difference, but I never had EXPLAIN enabled before.
I'm trying the CLI test now.....
Posts: 757
Ah, I just found out that the SET CURRENT EXPLAIN MODE causes only Dynamic SQL to be EXPLAINed.
"db2_exec()" is Dynamic SQL, but "db2_prepare()/db2_execute()" is actually Static SQL, so it is
not being EXPLAINed.
Sigh... I don't see a way to get EXPLAIN info for Static SQL in a PHP application. So, let's
forget about the EXPLAIN and Access Plan and all that stuff... it's highly unlikely that the
Optimizer is selecting a different Access Plan for the same statement. Let's just assume that
it isn't.
If you can get back your other script that does show the 2 second difference, then we can try
simplifying the statement in it, as described above. Just try to find one attribute of the
statement that (significantly) affects the time taken by db2_exec().
I'm getting dizzy...
Posts: 79
Ummm... I'm losing face here.
I got my script.
Times match. Either I'm going crazy, or I wasn't paying attention to what I was doing before, or FP13 fixed that problem. In fact, what I see now is that up to 6 items in the IN clause, prepare/exec matches exec almost exactly every time. Add a seventh item and prepare/exec goes out the window (jumps up to 1.5+ seconds), while db2_exec remains fast now (0.009).
Ok, enough of that. Back to my SQL trap; I need to see what's taking the time again.
Pursuant to what I see in my OS monitor, httpd is at the top of the 'top' list now, not db2. But db2sysc is right there under it.
Looking at a new SQL trap with timings (entering/leaving _query), it appears to be busy the whole time. All queries execute in 0.1 seconds or less; 99% are 0.00X. So db2 seems to be performing well, which is actually what I saw in one of my timing traps way back when. (But the few saying 2.1 threw me off. Maybe FP13 fixed the IN thing, maybe I forgot to activate the database because I took it offline for something... I don't know anymore.)
Man, I'm losing it. Ok, during the stall my trap is busy spitting out info the whole time, httpd is at the top, no query takes longer than 0.01 seconds to execute, and there are a couple of 1 to 1.3 second gaps from when it leaves _query to when it comes back to it. So that is consistent with my top report, that httpd (php) is busy.
So, I'm going back to my noted points in the code where the stall was taking place... I don't think it's really DB2 anymore.
Here's a sample _query trap/timing:
http://www.snyderworld.org/ftp/g2_querys
Parsed up a little:
http://www.snyderworld.org/ftp/g2_querys_enterleave
http://www.snyderworld.org/ftp/g2_querys_querytook
Posts: 757
Well, I'm sure the earth under your feet changed when you upgraded all your software. Probably
a good idea to go back to square 1, and not rely on anything we saw earlier.
OK, so we know that the time is being spent outside of _query(). I think it follows, then, that
the time is also not being spent in anything beyond _query() in the call stack.
Specifically, I'm thinking about ibm_db2 and DB2 itself... if they were slow, that time would
show up in the _query() times.
So the problem now seems to be somewhere in either ADOdb (before _query()), Gallery, PHP, Apache, or AIX.
I'd suggest we try to eliminate them in that order: for each layer, put timestamps around its
entry points; for example, timing Gallery's entry points would show the time spent in Gallery
(and beyond) versus the time NOT spent in Gallery (and beyond).
Or instead of trying to time one layer at a time, you could just leave all your timings in the
code and do the analysis of all layers at once.
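A minimal timestamp logger of the kind being described, as a sketch (the log path and labels are placeholders):

<?php
// Sketch: append high-resolution timestamps with a label, so gaps between
// 'enter'/'leave' pairs show which layer the time is disappearing into.
// The log path is a placeholder.
function g2_trace($label) {
    $line = sprintf("%.6f %s\n", microtime(true), $label);
    file_put_contents('/tmp/g2_trace.log', $line, FILE_APPEND);
}

// Example use around a suspect call:
g2_trace('enter _query');
// ... the call being timed ...
g2_trace('leave _query');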
Posts: 79
I'm going step by step back...
I did get the devel version of g2 going, is there anything useful in there, timing tests, or anything I can use there?
Posts: 757
> anything useful in [devel version of g2], timing tests, or anything I can use there?
Not that I know of, but I don't know for certain. If Valiant is still hanging around, perhaps
he knows?
Posts: 32509
There's just some slow query profiling in G2. See the part about $gallery->setProfile(array('sql')); in config.php.
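For reference, that's a one-liner in config.php; the commented debug line below is an assumption about a related setting, not something quoted from this thread:

<?php
// In Gallery2's config.php: enable the built-in slow-query profiling.
$gallery->setProfile(array('sql'));

// Assumption: config.php also exposes a debug setting that controls where such
// output goes; uncomment/adjust per the comments in that file.
// $gallery->setDebug('buffered');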
In fact, I find what you write alarming:
So the IN () clause issue exists. If the two approaches diverge starting with 6 elements in the IN predicate, then this is a problem the DB2 guys should be aware of.
Or does it have to do with the total query size (number of characters in the SQL query string)? Maybe the query buffer is too small?
Posts: 757
> IN predicate performance
I agree it is alarming, but G2 does not do separate prepare/execute... it does only db2_exec().
So yes it is an issue that I'd like to follow up on eventually, but more important right now
are things that affect G2.
Posts: 79
Ummmmmmmmmmmmmmm
You guys are gonna hate me.
I just figured it out.
Are you ready....
Are you sure.....
This is really, really, really stupid......
I feel like an XXX for wasting all of your time like this........
Not to mention countless hours of my own time...
But, hey, let's look on the bright side...
There IS a problem with the DB2 IN clause; I'll follow up on where that's happening (CLI level or ibm_db2 level) and try to get that taken care of.
But...this performance problem....ugh.......
So, remember I said that in my OS monitor, my hard disk was doing pretty much nothing.
Well, there's a little note there saying that if you want better statistics, run this command (it basically just opens up a kernel extension to read more detail).
So my disk report was showing ---- (more dashes means more activity).
I always saw a few dashes there, thinking that was just the actual retrieval of the picture. Now I'm looking at it going, why do I see WWWW? I shouldn't see writes...
So I poked around a bit to see what's going on.
I had my apache user's umask set to 200 (-w-------). Don't ask me how that happened; that's another story. In my data dir, the albums directory had a sticky bit of 0200 (so it could read its own files, even though it only had write permission). The cache and smarty directories weren't so fortunate, so every time a page was requested, even though all the cache and fast-inc files were there, it kept having to re-make them. I never saw the convert command in my top list, so I don't know how it was getting the thumbs themselves without re-converting them every time. Anyhow, I changed the umask to something more proper and now, finally, my problems are solved. No more stalling. Well, at least the second time you hit a page. I hit next, the page loads, my thumbs come in, and a 40-thumb page takes about 1.2 seconds to load completely. It was 1-2 seconds before, with APC turned on, just to begin loading the page.
So, well, at least we know what the problem was now.
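A quick way to sanity-check this kind of thing from PHP, as a sketch (the g2data paths are hypothetical; substitute the real ones):

<?php
// Sketch: confirm the web-server user can both read and write the G2 data
// directories. Paths are hypothetical examples.
$dirs = array(
    '/data/g2data/albums',
    '/data/g2data/cache',
    '/data/g2data/smarty',
);

printf("current umask: %04o\n", umask());
foreach ($dirs as $dir) {
    printf("%-30s readable=%d writable=%d\n",
           $dir, (int) is_readable($dir), (int) is_writable($dir));
}
// If cache/smarty are not readable by the apache user, G2 has to rebuild their
// contents on every request -- the write activity seen in the disk monitor.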
I'm terribly sorry for taking up your time on this problem. It was driving me nuts though, I knew it couldn't be that slow.
I do very much appreciate your help. If I can help in any way to advance the G2 project, I'd be happy to offer my services (that is, if you trust my fat fingers).
And I'm almost done with the initial phase of porting the serendipity blog software to DB2, another positive that came from all this. One last hurdle there: LIMIT/OFFSET, which they use a LOT and which DB2 has no clause for. (I know, it's not standard SQL, that's why it's not there; those other DBs are cheating.)
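One common way around that on DB2 (an illustration, not necessarily how the port ended up doing it) is to wrap the query with ROW_NUMBER() OVER() and filter on the row-number range; the table and column names below are made up:

<?php
// Sketch: emulate MySQL's "LIMIT $limit OFFSET $offset" on DB2 using
// ROW_NUMBER() OVER(...). Table/column names are made up for illustration.
function db2_limit_query($sql, $orderBy, $offset, $limit) {
    // Number the rows of the original query, then keep only the wanted window.
    return sprintf(
        'SELECT * FROM (SELECT t.*, ROW_NUMBER() OVER (ORDER BY %s) AS rn ' .
        'FROM (%s) AS t) AS numbered WHERE rn BETWEEN %d AND %d',
        $orderBy, $sql, $offset + 1, $offset + $limit
    );
}

// Equivalent of MySQL's "... ORDER BY id LIMIT 20 OFFSET 40":
echo db2_limit_query('SELECT id, title FROM entries', 'id', 40, 20);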
Thanks again, I think this one is licked, and it was all my fault to begin with
Whew, 60+ posts on that....brutal..
Posts: 757
Ah-haa... No problem. Like you said there are positives that came out of it. Another positive
is... it looks like you've now upgraded your entire system.
Can you still repro that IN predicate problem? If so, could you attach or post your script
here? Thanks.
Posts: 79
Gladly.
It's your script to begin with.
http://www.snyderworld.org/ftp/lm_prepexec.php.txt
It's at 6 right now. Add one more value to it and the execution time jumps considerably for the prep/exec method.
I'm trying it in CLI now...
Posts: 79
Confirmed:
A simple compiled CLI application exhibits the same performance problem.
With 6 items in the IN clause it speeds right through; with 7 items, there's time for a sip of tea.
Posts: 757
OK, that's still fascinating because I can't repro the difference in PHP. Would you mind
sharing a copy of the simple CLI program?
You don't happen to have any funky "PATCHn=...." settings in your "db2cli.ini" file, do you?
Posts: 79
Yeah... I'll post it.
I may have placed the time function in not such a good place though... I'm getting inconsistent results. The PHP program shows it all the time.
Give me a few minutes with the CLI thing here while I carefully place the time functions.
Posts: 79
OK, here's the output from my CLI program.
To be fair, the timing for prep/exec starts prior to SQLPrepare and ends after SQLExecute. But with 6 parameters, prep/exec is faster.
Oh, and oops, "microseconds" should be "seconds"...
A few statements at 0.18 seconds and your app becomes a drag...
And here's a zip of the code
http://www.snyderworld.org/ftp/cliperf.zip
Posts: 79
Oh yeah.
I'm smokin' now.
Turned APC back on, turned on partial acceleration...
I can't click fast enough....
Posts: 757
Yeah, but the real question is... what are you smokin'?
Sorry, I just couldn't resist.
BTW, I can't compile your CLI program... IIRC, the time functions are different on Windows (or
at least the header file for time functions is), but it shouldn't be tough for me to fix once I
get the time to do it.
Posts: 79
CPU clock cycles... that's what I'm smokin'!
I tried to pick a time function that was portable. I thought struct timeval and
gettimeofday were good for windows.
Try to put
#include "win32/time.h" instead
I took them out of php's microtime function for just that reason. They should be good.
Posts: 757
Hmm, nope, doesn't help. I'll have to hit MSDN to figure out how to use time functions.
This is about the 3rd or 4th time I've had to go to MSDN to figure this out (I have to do this about once every 5 or 6 years). Maybe I should save a little sample program locally.
Posts: 757
Ah, the header file for time functions is "WinSock2.h". That's not bloody obvious, is it?
I still can't compile it though... now I get a lot of errors about the SQLINTEGER type. I
think the root error is:
But when I take that structure definition and paste it into the existing tbread sample, it
compiles fine. So there must be something else in your tbread that is breaking it?
Posts: 79
Um, I'm not sure. I've never compiled anything on windows.
I took the definition from an example in the samples/cli directory. Poke around in there and see if they define integer variable arrays as some other type? But I think those files should be the same as mine; they don't put different C code in for different platforms, do they?
Not much comes up on a search of that error message.
Posts: 79
I think the SQLBindParameter is having trouble with it. I had to play with the bind line a few times to get it right.
Try to change the array definition line from SQLINTEGER to SQL_C_NUMERIC