There are 3 tables in SQL Server with large amount of data, each table contains about 100000
rows. There is one SQL to fetch rows from the three tables. Its performance is very bad.
WITH t1 AS
(
SELECT
LeadId, dbo.get_item_id(Log) AS ItemId, DateCreated AS PriceDate
FROM
(SELECT
t.ID, t.LeadID, t.Log, t.DateCreated, f.AskingPrice
FROM
t
JOIN
f ON f.PKID = t.LeadID
WHERE
t.Log LIKE '%xxx%') temp
)
SELECT COUNT(1)
FROM t1
JOIN s ON s.ItemID = t1.ItemId
When checking its estimated execution plan, I find it uses a nested loop join with large rows. Loot at the screenshot below. The top part in the image return 124277 rows, and the bottom part is executed 124277 times! I guess this is why it is so slow.
We know that nested loop has big performance issue with large data. How to remove it, and use hash join or other join instead?
Edit: Below is the related function.
CREATE FUNCTION [dbo].[get_item_Id](@message VARCHAR(200))
RETURNS VARCHAR(200) AS
BEGIN
DECLARE @result VARCHAR(200),
@index int
--Sold in eBay (372827580038).
SELECT @index = PatIndex('%([0-9]%)%', @message)
IF(@index = 0)
SELECT @result='';
ELSE
SELECT @result= REPLACE(REPLACE(REPLACE(SUBSTRING(@message, PatIndex('%([0-9]%)%', @message),8000), '.', ''),'(',''),')','')
-- Return the result of the function
RETURN @result
END;