I think the best way to explain the problem is to do a little create/insert/pseudo code. If you understand what I'm doing here you'll easily see the problem, and I'm ready to hire you.
Create table table1(info1 varchar(20),info2 varchar(20),badcolumn varchar(390));
Insert into table1 values('data1','data2','ABCDEFGHIJKLMNO...[repeat for 390 chars]');
^^ Repeat for 100,000,000 rows.
Select badcolumn from table1 where info1 = 'data1' and info2 = 'data2';
> 100,000 Rows Returned
// Split badcolumn into chunks of three characters. EG: "ABC","DEF","GHI",etc.
Select count(*) from table1 where badcolumn like '%ABC%';
Select count(*) from table1 where badcolumn like '%DEF%';
Select count(*) from table1 where badcolumn like '%GHI%';
and so on.
"ABC" was found 1,434 times.
"DEF" was found 4,245 times.
"GHI" was found 42,909 times.
The obvious massive problem is the count using a wildcard - that's slow. The other problem I have is that the SQL server is remote and on a VERY slow connection so the 100,000 rows returned in the first query can take a long time.
Because there are 100,000,000 rows of badcolumn at 390 characters each, I can't reasonably split it into read-made chunks of three in the database. It would end up somewhere in the 10 billion+ rows area and many many gb. But maybe that's a viable solution?
I need someone that's capable of reworking this entire system to get the final result as efficiently as possible. I can upgrade the hardware but the network connection between the web server and the sql server is stuck on slow.
This is the sum of the problem so please suggest your solution along with your bid!
9 freelancers estão ofertando em média $174 para este trabalho
Greetings! We would love to work on optimizing your SQL Query. Please check PMB for the details of what we can do. Thank you. Best Regards, RJ Global Tech Development, Inc.