Search conception in TRichView
Posted: Fri Mar 02, 2007 8:52 am
by Michael Pro
Well, let me explain my trouble.
I'm using TRichViewEdit for my own text processing, and I've found that text search works only within a single item, not across a mix of several items.
Suppose I type one word, for example "elephant", and give part of it a different text style, for example change the font size or font name for "pha". After reformatting, that part becomes a new item.
This means that the word "elephant" is broken into three different parts, "ele-pha-nt". After that, text search no longer finds this word.
The situation I've described is common: a user writes one sentence using different styles (which means he implicitly uses different items), and text search fails even in this simple case. Even if he changes styles only for whole words, text search won't find a phrase that spans those items.
How should I manage text searching?
Posted: Sun Mar 04, 2007 5:45 pm
by Rob
You can concatenate each item while searching and see if that matches the search text.
For example, if a match may span multiple items, keep a 'pointer' to the first item. If that item is not long enough, add the second item and match again. If this is still too short, keep adding items until the length exceeds that of your search text. If there is no match, move to the next item, keep a 'pointer' to that and continue as described.
That would most probably do the trick. The code is not too hard to write.
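The approach described above can be sketched without any TRichView calls. This is only an illustration under stated assumptions: a paragraph is modelled as a plain array of strings (one string per item), and FindAcrossItems is a hypothetical helper name, not part of TRichView. For each candidate first item it keeps appending following items until the window is long enough to hold any match that starts inside that first item, so a match that begins in the last characters of one item and runs into later items is still found.

```pascal
program ConcatSearchSketch;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical model: each array element is the text of one item.
// Returns True and the 0-based item index plus the 1-based offset
// of the first character of the match.
function FindAcrossItems(const Items: array of string; const Needle: string;
  out StartItem, StartOfs: Integer): Boolean;
var
  i, j, MinLen, p: Integer;
  Window: string;
begin
  Result := False;
  StartItem := -1;
  StartOfs := 0;
  for i := 0 to High(Items) do
  begin
    // "Pointer" to the first item of the current window.
    Window := Items[i];
    // Long enough for a match starting at the last character of item i.
    MinLen := Length(Items[i]) + Length(Needle) - 1;
    j := i + 1;
    while (Length(Window) < MinLen) and (j <= High(Items)) do
    begin
      Window := Window + Items[j];
      Inc(j);
    end;
    p := Pos(Needle, Window);
    // Only accept matches that start inside item i; matches that start
    // in later items are found when the pointer moves on.
    if (p > 0) and (p <= Length(Items[i])) then
    begin
      StartItem := i;
      StartOfs := p;
      Result := True;
      Exit;
    end;
  end;
end;

var
  Item, Ofs: Integer;
begin
  if FindAcrossItems(['ele', 'pha', 'nt'], 'elephant', Item, Ofs) then
    WriteLn('found at item ', Item, ', offset ', Ofs)
  else
    WriteLn('not found');
  // The match starts in the last characters of the first item.
  if FindAcrossItems(['big ele', 'pha', 'nt!'], 'lephant', Item, Ofs) then
    WriteLn('found at item ', Item, ', offset ', Ofs)
  else
    WriteLn('not found');
end.
```

In a real document the window should also stop at paragraph boundaries and at non-text items; the sketch ignores both.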
Posted: Mon Mar 05, 2007 3:44 am
by Michael Pro
Rob wrote:You can concatenate each item while searching and see if that matches the search text.
For example, if a match may span multiple items, keep a 'pointer' to the first item. If that item is not long enough, add the second item and match again. If this is still too short, keep adding items until the length exceeds that of your search text. If there is no match, move to the next item, keep a 'pointer' to that and continue as described.
That would most probably do the trick. The code is not too hard to write.
Thanks for a tip.
But there is one big algorithmic question. Say we are searching for a 15-letter string. The first item has 5 letters, the second item has 4 letters and the third item has 7 letters: 16 letters in total.
Let's suppose that there is no match in this concatenation. After that we have to exclude the first item from the searched items and build a new concatenation.
It could be the concatenation of the second, third and fourth items, but the search string could start in the last few characters of the first item and run into the first characters of the fourth item.
Well, it's disappointing, but I understand that I have to implement a good search procedure for my application; it's the only way to search through the document.
PS
As I understand it, for a "search-and-replace" routine I need to select the found text and then simply insert the replacement text?
And one more question: if I want to search only within the selected text, what is the best way to do this? RichView actually searches through the whole document, not part of it.
Posted: Mon Mar 05, 2007 12:25 pm
by adwilson
Did you try function
TRichView.SearchText(s: String; SrchOptions: TRVSearchOptions): Boolean; ?
Posted: Tue Mar 06, 2007 5:53 am
by Michael Pro
adwilson wrote:Did you try function
TRichView.SearchText(s: String; SrchOptions: TRVSearchOptions): Boolean; ?
Yes, of course. If I hadn't used this function, I wouldn't have asked this question or posted such long messages.
As I said, the SearchText routine doesn't work for text that is spread across different items. If we make half of the text bold and leave the other half unmodified, this function won't find the whole text, because the normal text and the bold text are stored as different items.
Another problem: text search works only on the whole text, not on part of it.
Posted: Tue Mar 06, 2007 12:47 pm
by Sergey Tkachenko
I am copying a message from DavidRM.
The built-in SearchText() function doesn't find words or phrases that cross style boundaries. It's also not very Unicode friendly. This is my attempt at solving both problems.
I've tested it some in my own application, and it seems to be working. It's obviously based heavily on the built in SearchText, but...different.
I'm open to comments, suggestions, and warnings of very bad choices.
Code:
type
TSTUFlag=(stufFromStart,stufMatchCase,stufWholeWord);
TSTUFlags=set of TSTUFlag;
type
TSTUItem=class(TObject)
ItemNo: integer;
Length: integer;
end;
function SearchTextUnicode(rv:TCustomRichView; const searchText:widestring; flags:TSTUFlags):Boolean;
function DoSearchTextUnicodeInItem(item:TCustomRVItemInfo; storeSub:TRVStoreSubRVData;
subRVData:TCustomRVFormattedData;
const searchText:widestring):boolean; forward;
function CheckIsDelimiter(RVData:TCustomRVData; ch:widechar):boolean;
begin
Result:=RVData.IsDelimiterW(Word(ch));
end;
function DoSearchTextUnicode(RVData:TCustomRVData; const searchText:widestring;
startItemNo, startItemOfs:integer):boolean;
var
ii, styleNo, p1, p2, ofs, p1Ofs, ll: integer;
selStart, selStartOfs, selEnd, selEndOfs: integer;
paraText, itemText: widestring;
paraList: TObjectList;
paraItem: TSTUItem;
item: TCustomRVItemInfo;
storeSub: TRVStoreSubRVData;
subRVData: TCustomRVFormattedData;
chkText: widestring;
begin
Result:=false;
ii:=startItemNo;
while (ii<RVData.ItemCount) do
begin
styleNo:=RVData.GetItemStyle(ii);
if styleNo>=0 then
begin
// collect paragraph text
paraList:=TObjectList.Create;
try
itemText:=RVData.GetItemTextW(ii);
if not (stufMatchCase in flags) then
itemText:=WideUpperCase(itemText);
if startItemOfs>0 then
begin
Delete(itemText,1,startItemOfs-1);
startItemOfs:=0;
end;
paraItem:=TSTUItem.Create;
paraItem.ItemNo:=ii;
paraItem.Length:=Length(itemText);
paraList.Add(paraItem);
paraText:=itemText;
while ((ii+1)<RVData.ItemCount)
and (RVData.GetItemStyle(ii+1)>=0)
and (not RVData.IsParaStart(ii+1)) do
begin
itemText:=RVData.GetItemTextW(ii+1);
if not (stufMatchCase in flags) then
itemText:=WideUpperCase(itemText);
paraItem:=TSTUItem.Create;
paraItem.ItemNo:=ii+1;
paraItem.Length:=Length(itemText);
paraList.Add(paraItem);
paraText:=paraText+itemText;
inc(ii);
end;
p1:=Pos(searchText,paraText);
Result:=(p1>0);
if Result and (stufWholeWord in flags) then
begin
Result:=false;
chkText:=paraText;
ofs:=0;
while (p1<>0) do
begin
p1Ofs:=ofs+p1;
Result:=(p1Ofs=1) or ((p1Ofs>1) and CheckIsDelimiter(RVData,paraText[p1Ofs-1]));
if Result then
Result:=(((p1Ofs+Length(searchText))>Length(paraText))
or (((p1Ofs+Length(searchText))<=Length(paraText))
and CheckIsDelimiter(RVData,paraText[p1Ofs+Length(searchText)])));
if Result then
begin
p1:=p1Ofs;
Break;
end;
Delete(chkText,1,p1+(Length(searchText)-1));
inc(ofs,p1+(Length(searchText)-1));
p1:=Pos(searchText,chkText);
end;
end;
if Result then
begin
// select in RVData
p2:=p1+(Length(searchText)-1);
selStart:=-1;
selStartOfs:=0;
selEnd:=-1;
selEndOfs:=0;
ll:=0;
while (ll<paraList.Count) do
begin
paraItem:=TSTUItem(paraList[ll]);
if p1>paraItem.Length then
begin
dec(p1,paraItem.Length);
dec(p2,paraItem.Length);
inc(ll);
end
else
begin
selStart:=paraItem.ItemNo;
selStartOfs:=p1;
if p2<=paraItem.Length then
begin
selEnd:=selStart;
selEndOfs:=p2+1;
end
else
begin
selEnd:=selStart;
selEndOfs:=selStartOfs;
dec(p2,paraItem.Length);
inc(ll);
while (ll<paraList.Count) do
begin
paraItem:=TSTUItem(paraList[ll]);
if p2>paraItem.Length then
begin
dec(p2,paraItem.Length);
inc(ll);
end
else
begin
selEnd:=paraItem.ItemNo;
selEndOfs:=p2+1;
Break;
end;
end;
end;
Break;
end;
end;
if selStart>=0 then
TCustomRVFormattedData(rvdata).SetSelectionBounds(selStart,selStartOfs,selEnd,selEndOfs);
end;
paraList.Clear;
finally
paraList.Free;
end; // try
if Result then
Exit;
end
else
begin
item:=rvData.GetItem(ii);
storeSub:=nil;
subRVData:=TCustomRVFormattedData(item.GetSubRVData(storeSub,rvdFirst));
if subRVData<>nil then
begin
Result:=DoSearchTextUnicodeInItem(item,storeSub,subRVData,searchText);
if Result then
Exit;
end;
end;
inc(ii);
end;
end;
function DoSearchTextUnicodeInItem(item:TCustomRVItemInfo; storeSub:TRVStoreSubRVData;
subRVData:TCustomRVFormattedData;
const searchText:widestring):boolean;
begin
Result:=DoSearchTextUnicode(subRVData,searchText,0,0);
if not Result then
begin
repeat
subRVData:=TCustomRVFormattedData(item.GetSubRVData(storeSub,rvdNext));
if SubRVData=nil then
Break;
Result:=DoSearchTextUnicode(subRVData,searchText,0,0);
until Result;
end;
if Result then
item.ChooseSubRVData(storeSub);
storeSub.Free;
end;
var
useSearchText: widestring;
item:TCustomRVItemInfo;
storeSub: TRVStoreSubRVData;
subRVData: TCustomRVFormattedData;
startItemNo, startItemOfs, sl, so: integer;
begin
// Result must be defined on the early Exit paths below
Result:=false;
if not (stufMatchCase in flags) then
useSearchText:=WideUpperCase(searchText)
else
useSearchText:=searchText;
item:=nil;
storeSub:=nil;
if not (stufFromStart in flags) then
begin
item:=rv.RVData.GetChosenItem;
if item=nil then
item:=rv.RVData.PartialSelectedItem;
if item<>nil then
begin
subRVData:=TCustomRVFormattedData(item.GetSubRVData(storeSub,rvdChosenDown));
if SubRVData<>nil then
begin
Result:=DoSearchTextUnicodeInItem(item,storeSub,subRVData,useSearchText);
if Result then
Exit;
end;
end;
end;
if (stufFromStart in flags) then
begin
startItemNo:=0;
startItemOfs:=rv.RVData.GetOffsBeforeItem(startItemNo);
end
else
begin
if rv.RVData.SelectionExists(true,false) then
rv.RVData.GetSelBounds(sl,startItemNo,so,startItemOfs,true)
else
begin
startItemNo:=rv.RVData.GetFirstItemVisible;
startItemOfs:=rv.RVData.GetOffsBeforeItem(startItemNo);
end;
if startItemNo<0 then
startItemNo:=0;
end;
// valid item indexes are 0..Count-1
if startItemNo>=rv.RVData.Items.Count then
Exit;
if rv.RVData.Items.Objects[startItemNo]=item then
begin
inc(startItemNo);
if startItemNo>rv.RVData.Items.Count-1 then
Exit;
startItemOfs:=rv.RVData.GetOffsBeforeItem(startItemNo);
end;
Result:=DoSearchTextUnicode(rv.RVData,useSearchText,startItemNo,startItemOfs);
end;
-David
Edit: Fixed issue with repeated function calls.
Posted: Wed Mar 07, 2007 4:33 am
by Michael Pro
Sergey, thanks a lot for this post - I'll test this code immediately.
As I said, I have one more question: what is the best way to search within the selected text? Should I create another RichView object and place the selected text there, or is there a more elegant algorithm?
Maybe it's better to simply check the selection range without creating another RichView object?
PS I've examined this code and understood the best way to search within the selected text.
If this code works fine, it'll be great
Thanks once more
Posted: Wed Mar 07, 2007 6:06 am
by Sergey Tkachenko
You can place the caret at the beginning of the selection; the search will start from there. If the found word is beyond the other bound of the original selection, restore the selection to its previous state (use functions from RVLinear to store and restore the selection).
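The bounds check in this suggestion can be illustrated on a plain string. This is a sketch under assumptions, not TRichView code: the document is modelled as one linear string, the selection as a 1-based start position plus a length, and FindInSelection is a hypothetical helper. In the editor, the linear positions would come from whichever RVLinear functions you use to store and restore the selection.

```pascal
program SelectionSearchSketch;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical model: Text is the whole document as one string;
// SelStart is 1-based, SelLength is the number of selected characters.
function FindInSelection(const Text, Needle: string;
  SelStart, SelLength: Integer; out FoundPos: Integer): Boolean;
var
  p: Integer;
begin
  FoundPos := 0;
  // Search from the beginning of the selection to the end of the text.
  p := Pos(Needle, Copy(Text, SelStart, MaxInt));
  if p > 0 then
    FoundPos := SelStart + p - 1;
  // Accept the match only if it ends inside the original selection.
  Result := (p > 0) and (FoundPos + Length(Needle) <= SelStart + SelLength);
  if not Result then
    FoundPos := 0; // the caller would restore the original selection here
end;

const
  Doc = 'one two three two';
var
  FoundAt: Integer;
begin
  // The selection covers "one two" (positions 1..7).
  if FindInSelection(Doc, 'two', 1, 7, FoundAt) then
    WriteLn('found at ', FoundAt)
  else
    WriteLn('not in selection');
  // "three" exists in the document, but only after the selection.
  if FindInSelection(Doc, 'three', 1, 7, FoundAt) then
    WriteLn('found at ', FoundAt)
  else
    WriteLn('not in selection');
end.
```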
Posted: Wed Mar 07, 2007 10:06 am
by Michael Pro
Sergey Tkachenko wrote:You can place the caret at the beginning of the selection; the search will start from there. If the found word is beyond the other bound of the original selection, restore the selection to its previous state (use functions from RVLinear to store and restore the selection).
I'm so grateful for this help. At first glance everything works OK. I didn't think that I could implement good searching in such an easy way. I especially want to thank DavidRM for contributing these great search routines.
By the way, I've found that SearchTextUnicode doesn't always make the right selection in RichViewEdit (sometimes the selection isn't highlighted), but I solved this problem by simply calling the Reformat procedure, and after that everything works fine.
Posted: Wed Mar 07, 2007 2:17 pm
by Sergey Tkachenko
Reformat is an overkill here. Use Invalidate.
Posted: Fri Mar 09, 2007 4:13 am
by Michael Pro
Sergey Tkachenko wrote:Reformat is an overkill here. Use Invalidate.
OK, I rewrote this part.
I have one more piece of advice about this code. I didn't find an implementation of directional search in it, so I did that directly in my own processing procedures. If you are going to include this contributed procedure in further versions of TRichView, it would be useful to implement all the search options from, for example, the Delphi search dialogs (that would also be closer to the way search works in MS Word).
This means that the ideal full list of options looks like this:
a) Case sensitive option (implemented in this code)
b) Whole words only (implemented in this code)
c) Searching in entire scope/from cursor (partially implemented)
d) Searching in selected text/global search (not implemented, and because of this, option c) won't work correctly in all cases; I had to make some adjustments to the contributed code and fix it in my inherited class, which doesn't look so nice from an object-oriented point of view)
e) Direction of search (not implemented): this is the Achilles' heel of this code. Without changing the code, I did it by cycling through the desired part of the document and remembering the last found item. If the search routine can't find the text, I simply restore the last saved information. It's not a nice way to implement a "Find Next" method, but the only other way to get this functionality is to rewrite the procedure DavidRM contributed. This means that in the worst case it works with O(n²) complexity (whereas sequential search has O(n) complexity). Of course, in the future I'll optimize my code and fix this slow part of the search, but not now.
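The workaround described in item e) (scan forward and remember the last match) can be sketched on a plain string. This is an illustration under assumptions: the document is modelled as one string, FindPrev is a hypothetical helper, and PosEx comes from the StrUtils unit. Each call rescans from the start of the text, so one "find previous" is linear in the text length, and repeating it for every match is what makes backward navigation slower than a single sequential pass.

```pascal
program FindPrevSketch;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  StrUtils; // PosEx

// Returns the 1-based position of the last match that starts before
// BeforePos, or 0 if there is none. Works by scanning forward and
// remembering the last match seen.
function FindPrev(const Text, Needle: string; BeforePos: Integer): Integer;
var
  p: Integer;
begin
  Result := 0;
  p := PosEx(Needle, Text, 1);
  while (p > 0) and (p < BeforePos) do
  begin
    Result := p; // back up the last found position
    p := PosEx(Needle, Text, p + 1);
  end;
end;

const
  Doc = 'Search text Search again';
begin
  // Caret at the end of the text: the previous match is the second one.
  WriteLn('previous match at ', FindPrev(Doc, 'Search', Length(Doc) + 1));
  // Caret on the second match: the previous match is the first one.
  WriteLn('previous match at ', FindPrev(Doc, 'Search', 13));
  // Caret on the first match: nothing before it.
  WriteLn('previous match at ', FindPrev(Doc, 'Search', 1));
end.
```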
Despite all these disadvantages, I understand that other editors/viewers are far behind your components, and if the functionality described above could be implemented in further versions, it would be outstanding.
With best regards
Posted: Fri Mar 09, 2007 4:39 am
by Michael Pro
There is a big bug in this code.
For example, we are searching for the word "Search" in the following text:
Search text Search.
The first time, the procedure works fine and selects the first "Search", but if we search again, the selection goes wrong.
Why? Because deleting the text up to the end of the first found selection from the searched text affects the calculation of offsets for further searching, so the procedure finds the wrong offset (it marks " text " because it deletes the word Search from the first sentence and starts the offset from zero).
Posted: Fri Mar 09, 2007 5:25 am
by Michael Pro
Well, I've found this bug. It's caused by a wrong offset calculation when the next occurrence of the search text is located in the same item.
Code:
function DoSearchTextUnicode(RVData:TCustomRVData; const searchText:widestring;
startItemNo, startItemOfs:integer):boolean;
var
ii, styleNo, p1, p2, ofs, p1Ofs, ll: integer;
selStart, selStartOfs, selEnd, selEndOfs: integer;
paraText, itemText: widestring;
paraList: TObjectList;
paraItem: TSTUItem;
item: TCustomRVItemInfo;
storeSub: TRVStoreSubRVData;
subRVData: TCustomRVFormattedData;
chkText: widestring;
Deleted: Boolean;
OldOffset: integer;
OldItem: Integer;
begin
Result:=false;
Deleted := false;
ii:=startItemNo;
OldItem := startItemNo;
while (ii<RVData.ItemCount) do
begin
styleNo:=RVData.GetItemStyle(ii);
if styleNo>=0 then
begin
// collect paragraph text
paraList:=TObjectList.Create;
try
itemText:=RVData.GetItemTextW(ii);
if not (stufMatchCase in flags) then
itemText:=WideUpperCase(itemText);
if startItemOfs>0 then
begin
Delete(itemText,1,startItemOfs-1);
//!!!!
//Check here
if ii = OldItem then
begin
Deleted := true;
OldOffset := startItemOfs - 1;
end;
startItemOfs:=0;
end;
paraItem:=TSTUItem.Create;
paraItem.ItemNo:=ii;
paraItem.Length:=Length(itemText);
paraList.Add(paraItem);
paraText:=itemText;
while ((ii+1)<RVData.ItemCount)
and (RVData.GetItemStyle(ii+1)>=0)
and (not RVData.IsParaStart(ii+1)) do
begin
itemText:=RVData.GetItemTextW(ii+1);
if not (stufMatchCase in flags) then
itemText:=WideUpperCase(itemText);
paraItem:=TSTUItem.Create;
paraItem.ItemNo:=ii+1;
paraItem.Length:=Length(itemText);
paraList.Add(paraItem);
paraText:=paraText+itemText;
inc(ii);
end;
p1:=Pos(searchText,paraText);
Result:=(p1>0);
if Result and (stufWholeWord in flags) then
begin
Result:=false;
chkText:=paraText;
ofs:=0;
while (p1<>0) do
begin
p1Ofs:=ofs+p1;
Result:=(p1Ofs=1) or ((p1Ofs>1) and CheckIsDelimiter(RVData,paraText[p1Ofs-1]));
if Result then
Result:=(((p1Ofs+Length(searchText))>Length(paraText))
or (((p1Ofs+Length(searchText))<=Length(paraText))
and CheckIsDelimiter(RVData,paraText[p1Ofs+Length(searchText)])));
if Result then
begin
p1:=p1Ofs;
Break;
end;
Delete(chkText,1,p1+(Length(searchText)-1));
inc(ofs,p1+(Length(searchText)-1));
p1:=Pos(searchText,chkText);
end;
end;
if Result then
begin
// select in RVData
p2:=p1+(Length(searchText)-1);
selStart:=-1;
selStartOfs:=0;
selEnd:=-1;
selEndOfs:=0;
ll:=0;
while (ll<paraList.Count) do
begin
paraItem:=TSTUItem(paraList[ll]);
if p1>paraItem.Length then
begin
dec(p1,paraItem.Length);
dec(p2,paraItem.Length);
inc(ll);
end
else
begin
selStart:=paraItem.ItemNo;
selStartOfs:=p1;
if p2<=paraItem.Length then
begin
selEnd:=selStart;
selEndOfs:=p2+1;
end
else
begin
selEnd:=selStart;
selEndOfs:=selStartOfs;
dec(p2,paraItem.Length);
inc(ll);
while (ll<paraList.Count) do
begin
paraItem:=TSTUItem(paraList[ll]);
if p2>paraItem.Length then
begin
dec(p2,paraItem.Length);
inc(ll);
end
else
begin
selEnd:=paraItem.ItemNo;
selEndOfs:=p2+1;
Break;
end;
end;
end;
Break;
end;
end;
if selStart>=0 then
begin
//!!!!
//Check here
if Deleted and (OldItem = selStart) then
begin
if selStart = selEnd then
TCustomRVFormattedData(rvdata).SetSelectionBounds(selStart,
selStartOfs + OldOffset, selEnd, selEndOfs + OldOffset)
else
TCustomRVFormattedData(rvdata).SetSelectionBounds(selStart,
selStartOfs + OldOffset, selEnd, selEndOfs);
end
else
TCustomRVFormattedData(rvdata).SetSelectionBounds(selStart,selStartOfs,selEnd,selEndOfs);
end;
end;
paraList.Clear;
finally
paraList.Free;
end; // try
if Result then
Exit;
end
else
begin
item:=rvData.GetItem(ii);
storeSub:=nil;
subRVData:=TCustomRVFormattedData(item.GetSubRVData(storeSub,rvdFirst));
if subRVData<>nil then
begin
Result:=DoSearchTextUnicodeInItem(item,storeSub,subRVData,searchText);
if Result then
Exit;
end;
end;
inc(ii);
end;
end;
I've marked the changes with "//!!!!" and "//Check here" comments.
It could be useful for those who want to use this routine in their projects, but it should be properly tested; I may actually have added another bug with my changes.
Posted: Fri Mar 09, 2007 5:33 am
by Michael Pro
In the part where we make the text selection, we need to check whether the search text was found in the start item or somewhere else.
If it's located in the start item, then we have to modify the offsets and check the following:
a) If the search text is located entirely in the start item, we need to modify the start AND end offsets, because both have been calculated wrongly;
b) If only part of the text is located in the start item, we need to modify only the start offset, NOT the end offset, because the end offset has been calculated correctly.
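The two cases above come down to simple arithmetic, sketched here on the "Search text Search." example from this thread. It is only an illustration under assumptions: a single item is modelled as a plain string, AdjustForTrim is a hypothetical helper, and the end offset follows the convention of the posted code (the position just after the last selected character). The sketch assumes the match starts in the trimmed start item.

```pascal
program OffsetFixSketch;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

// Trimmed is the number of characters deleted from the front of the
// start item before searching. The start offset always needs it back;
// the end offset needs it only if the match also ends in the start item.
procedure AdjustForTrim(Trimmed: Integer; EndInStartItem: Boolean;
  var SelStartOfs, SelEndOfs: Integer);
begin
  Inc(SelStartOfs, Trimmed);
  if EndInStartItem then
    Inc(SelEndOfs, Trimmed);
end;

const
  ItemText = 'Search text Search.';
  Needle = 'Search';
var
  Rest: string;
  StartOfs, Trimmed, p, SelStartOfs, SelEndOfs: Integer;
begin
  // The first match was selected as 1..7, so the next search resumes at 7.
  StartOfs := 7;
  Trimmed := StartOfs - 1;
  Rest := ItemText;
  Delete(Rest, 1, Trimmed);
  p := Pos(Needle, Rest);
  if p > 0 then
  begin
    // Offsets relative to the trimmed text, as the search computes them.
    SelStartOfs := p;
    SelEndOfs := p + Length(Needle);
    WriteLn('uncorrected: "',
      Copy(ItemText, SelStartOfs, SelEndOfs - SelStartOfs), '"');
    // Case a): the whole match lies in the start item.
    AdjustForTrim(Trimmed, True, SelStartOfs, SelEndOfs);
    WriteLn('corrected:   "',
      Copy(ItemText, SelStartOfs, SelEndOfs - SelStartOfs), '"');
  end
  else
    WriteLn('not found');
end.
```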
Posted: Fri Mar 09, 2007 8:33 am
by Michael Pro
Another problem I found is an endless loop when implementing something like a "Replace All" method. If we call this procedure to replace A with AA, it generates endless changes.
The fixed variant is here:
Code:
var
useSearchText: widestring;
item:TCustomRVItemInfo;
storeSub: TRVStoreSubRVData;
subRVData: TCustomRVFormattedData;
startItemNo, startItemOfs, sl, so: integer;
begin
// Result must be defined on the early Exit paths below
Result:=false;
if not (stufMatchCase in flags) then
useSearchText:=WideUpperCase(searchText)
else
useSearchText:=searchText;
item:=nil;
storeSub:=nil;
if not (stufFromStart in flags) then
begin
item:=rv.RVData.GetChosenItem;
if item=nil then
item:=rv.RVData.PartialSelectedItem;
if item<>nil then
begin
subRVData:=TCustomRVFormattedData(item.GetSubRVData(storeSub,rvdChosenDown));
if SubRVData<>nil then
begin
Result:=DoSearchTextUnicodeInItem(item,storeSub,subRVData,useSearchText);
if Result then
Exit;
end;
end;
end;
if (stufFromStart in flags) then
begin
startItemNo:=0;
startItemOfs:=rv.RVData.GetOffsBeforeItem(startItemNo);
end
else
begin
if rv.RVData.SelectionExists(true,false) then
rv.RVData.GetSelBounds(sl,startItemNo,so,startItemOfs,true)
else
begin
{!!!!}
{Check here}
//startItemNo:=rv.RVData.GetFirstItemVisible;
//startItemOfs:=rv.RVData.GetOffsBeforeItem(startItemNo);
startItemNo := CurItemNo;
startItemOfs := OffsetInCurItem;
end;
if startItemNo<0 then
startItemNo:=0;
end;
// valid item indexes are 0..Count-1
if startItemNo>=rv.RVData.Items.Count then
Exit;
if rv.RVData.Items.Objects[startItemNo]=item then
begin
inc(startItemNo);
if startItemNo>rv.RVData.Items.Count-1 then
Exit;
startItemOfs:=rv.RVData.GetOffsBeforeItem(startItemNo);
end;
Result:=DoSearchTextUnicode(rv.RVData,useSearchText,startItemNo,startItemOfs);
As you can see, the trouble is here:
Code:
startItemNo:=rv.RVData.GetFirstItemVisible;
startItemOfs:=rv.RVData.GetOffsBeforeItem(startItemNo);
I've changed it to:
Code:
startItemNo := CurItemNo;
startItemOfs := OffsetInCurItem;
And it seems to work fine.
But I understand that I'm using the search routine for text only (I didn't implement good graphics support in my inherited editor), so the correctness of this code still needs to be examined and tested.
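The "Replace All" problem described in this post (replacing A with AA never ends if the next search can see the text that was just inserted) can be shown on a plain string. This is a sketch under assumptions: the document is modelled as one string, ReplaceAllOnce is a hypothetical helper, and PosEx comes from the StrUtils unit. The point is that the next search resumes just after the inserted replacement, which is the string-level analogue of resuming from the caret position instead of from the first visible item.

```pascal
program ReplaceAllSketch;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  StrUtils; // PosEx

// Replaces every occurrence of OldS with NewS in a single pass.
// The search position always moves past the inserted text, so the
// loop terminates even when NewS contains OldS.
function ReplaceAllOnce(const Text, OldS, NewS: string): string;
var
  From, p: Integer;
begin
  Result := Text;
  if OldS = '' then
    Exit;
  From := 1;
  p := PosEx(OldS, Result, From);
  while p > 0 do
  begin
    Delete(Result, p, Length(OldS));
    Insert(NewS, Result, p);
    // Resume after the replacement, never inside it.
    From := p + Length(NewS);
    p := PosEx(OldS, Result, From);
  end;
end;

begin
  WriteLn(ReplaceAllOnce('A and A', 'A', 'AA'));
end.
```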