Jumat, 19 Juli 2013

Re: [ExcelVBA] Re: Transposition errors

Dear All:
Thank you so much!
The macro works beautifully and it solves my problem.

Keep up the good work!

Regards, Saad


On Wed, Jul 17, 2013 at 1:11 PM, Derek Turner <g4swy@yahoo.com> wrote:

> **
>
>
> Dear Steve
>
> In order to understand Paul's code I reworked it in my own
> self-documenting style. :-
>
> Sub Find_Transpose()
> Dim nRowCount As Long, nCurrentRow As Long, nPosition As Long
> Dim sItemNumber As String, sItemNumberWithTransposition As String
> Dim bAddFlag As Boolean, nItemLength As Long, sPairLeft As String,
> sPairRight As String
> '
> Sheets("Data").Activate
> nRowCount = Application.WorksheetFunction.CountA(Range("A1:A65000"))
> Range("b2:b" & nRowCount).ClearContents
> With CreateObject("Scripting.Dictionary")
> .RemoveAll
> For nCurrentRow = 2 To nRowCount
> sItemNumber = Cells(nCurrentRow, "a").Value
> nItemLength = Len(sItemNumber)
> bAddFlag = True
> If Not .exists(sItemNumber) Then
> If nItemLength > 2 Then
>
> For nPosition = 1 To nItemLength - 1
> sPairLeft = Mid(sItemNumber, nPosition, 1)
> sPairRight = Mid(sItemNumber, nPosition + 1, 1)
> If sPairLeft <> sPairRight Then
> sItemNumberWithTransposition = sItemNumber
> Mid(sItemNumberWithTransposition, nPosition,
> 2) = sPairRight & sPairLeft
> If .exists(sItemNumberWithTransposition) Then
> WriteToCell Cells(nCurrentRow, "b"),
> sItemNumberWithTransposition
> WriteToCell
> Cells(.Item(sItemNumberWithTransposition), "b"), sItemNumber
> bAddFlag = False
>
> Exit For
> End If
> End If
> Next nPosition
> If bAddFlag Then
> .Add sItemNumber, nCurrentRow
> End If
> End If
> End If
> Next nCurrentRow
> End With ' CreateObject("Scripting.Dictionary")
> MsgBox "Finished"
> End Sub
> Sub WriteToCell(rCell As Range, sValue As String)
> If IsEmpty(rCell) Then
> rCell.Value = sValue
> Else ' slash here instead of comma defeats cell comma format
> rCell.Value = rCell.Value & "/" & sValue
> End If
> End Sub
>
> Notice that there are no compares, only dictionary lookups.
>
> By the way if Yahoo removes indents, a good trick is to paste the code
> into VB.Net, then copy and paste back into Excel.
> A few brackets get added but it;s no big deal to remove them.
>
> Regards
>
> Derek Turner ++
>
>
> >________________________________
> > From: noskosteve <noskosteve@yahoo.com>
> >To: ExcelVBA@yahoogroups.com
> >Sent: Wednesday, 17 July 2013, 2:48
>
> >Subject: [ExcelVBA] Re: Transposition errors
> >
> >
> >
> >
> >
> >
> >Sorry if this is a re-post.
> >Yahoo glitch...getting more and more common.(:-(
> >
> >Ugh! I am really sorry, Paul. I didn't intend for you to go to such
> detail on your *whole* algo. I understood the vast majority of it before. I
> only was after a 20,000 foot version (like what follows) of the "Else" you
> mentioned that you added after my comment. I thought it was only that part
> which addressed a loophole you saw in my algo.
> >
> >You are swapping *all* potential transposition pairs (6) in every "new"
> string and comparing all six of these strings with every dictionary entry.
> >
> >I was only comparing individual characters in both strings (the "new one"
> and each dictionary entry) and counting how many consecutive characters
> differ. I didn't have to check all characters in every string.
> >
> >Comparing individual characters allows you to eliminate many un-necessary
> compares because you have already proven it can't be a transposition.
> >
> >Here's the thought process:
> >
> >When you get three non-matching characters in a row, quit this dictionary
> entry and go to the next dictionary entry. Don't keep comparing to that
> dictionary entry - three consecutive non-matches means it cannot be a
> transposition.
> >
> >ONLY if you finish comparing with a dictionary entry and get only two
> consecutive non-matches, is it a potential transposition - mark it as a
> potential transposition. It is only for this last case where you have to
> compare all characters in the strings - thus you save many compares.
> >Here you should go to the next dictionary entry because you can have more
> than one two-character-swap matches
> >
> >When you finish comparing a "new" string with the whole dictionary and do
> not get any two-character hits (which are potential transpositions), then
> it's not a potential transposition. Add this "new" string to the
> dictionary.
> >
> >When you finish the dictionary, loop to the next "new" string.
> >
> >That is where I was going. You can determine that a transposition is
> impossible before comparing the whole string, so stop comparing and move on.
> >
> >I also noted that if these (P.O.) strings were assigned consecutively, I
> would do the comparisons from right to left, so I wasn't comparing mostly
> same characters. This would get to the three character guaranteed
> mis-matches in the string sooner for all non-transposed strings.
> >
> >This only flags potential transpositions that must be human-resolved. DO
> you see a loophole in that?
> >
> >Steve
> >
> >--- In ExcelVBA@yahoogroups.com, Paul Schreiner <schreiner_paul@...>
> wrote:
> >>
> >> I know, right?Â
> >> when I wrote it, my VB Editor has it indented and looks clean and neat.
> >> Then, when I posted it, it stripped off the indents and makes it look
> messy!
> >> Â
> >> I'll try to describe my logic.
> >> I'll also skip descriptions of loops.
> >> Â
> >> Problem:
> >> He has a list of 7-digit numbers
> >> (I wasn't sure if they were strictly numeric or alpha-numeric, so I
> chose to treat them as 7 character strings)
> >> Â
> >> Being manually entered, there are MANY types of keying errors.
> >> For this project, he is only interested in looking for adjacent
> transposed digits that result in a number identical to a previously entered
> number.
> >> Â
> >> Solution:
> >> Â
> >> There is no way to determine which of the two numbers is correct:
> >> The first could have transposed digits making it match the second, or
> the reverse is true.
> >> So, I chose to "flag" both records for later evaluation.
> >> I decided that simply "flagging" the records doesn't indicate which
> digits or records caused the match, so I elected to use the matching record
> number as the "flag".
> >> Â
> >> I used a Dictionary Object to build a very fast index of the numbers.
> >> I also assumed a header row, so the data begins in row 2.
> >> I load the first number into the dictionary and save the row number of
> the entry (2).
> >> I then loop through the rest of the data:
> >> For R = 3 to nRows
> >> Â
> >> I load the cell value into the variable "ItemNo":
> >> ItemNo = Sheets(sht).Cells(R, "A").Value
> >>
> >> I then checked to see if this Item Number has already been used:
> >> If (Not Dict_Docs.exists(ItemNo)) Then
> >>
> >> Here is where an unanswered question was encountered.
> >> Is each record supposed to be unique?
> >> That is, can there be multiple entries with the same record number?
> >> If an exact match is allowed, then we would skip the record.
> >> If each record is supposed to be unique, then we should an an ELSE
> construct that will "flag" both this record and the previous matching
> record.
> >>
> >> I chose to "skip" the matching records.
> >>
> >> Next, the intent was to swap each of the pairs of numbers in the string
> and look to see if it results in a match to previously loaded numbers.
> >> for a number like: 1234567
> >> you can either start with the 1 and proceed to 6 (one less than the
> length),
> >> swapping each digit with the following digit,
> >> or start with the second digit to the last, swapping with the previous
> digit.
> >>
> >> I chose the latter:
> >> for inx = 2 to len(ItemNo)
> >>
> >>
> >> I then tested to two digits being swapped to see if they were the same.
> >>
> >> If (Mid(ItemNo, inx, 1) <> Mid(ItemNo, inx - 1, 1)) Then
> >>
> >> if you have a number like 1233567 and swap the third and fourth digit,
> >> you end up with the same number! which would be the same as comparing
> before swapping, so I skipped this iteration of the loop.
> >>
> >> Next, I needed to build a string with swapped digits (ItemNo2).
> >>
> >> for a number like 1234567, for the iteration where inx = 4,
> >> I'm swapping the 3rd and 4th digit.
> >> So the string looks like:
> >> "digits 1-2" & "4th digit" & "3rd digit" & "digits 5-7"
> >>
> >> Now, when you're working with a new number, the first character swap is
> the first two digits, so there's NOTHING to the left of the swapped digits.
> >>
> >> For this case (inx = 2) I set ItemNo2 = "", otherwise, I set it equal
> to the string prior to the swapped digits:
> >> ItemNo2 = Left(ItemNo, inx - 2)
> >>
> >> I then add the second of the swapped pair:
> >> ItemNo2 = ItemNo2 & Mid(ItemNo, inx, 1)
> >>
> >> Add the first of the swapped pair:
> >> ItemNo2 = ItemNo2 & Mid(ItemNo, inx - 1, 1)
> >>
> >> and then the balance of the string:
> >> ItemNo2 = ItemNo2 & Mid(ItemNo, inx + 1, Len(ItemNo) - inx)
> >>
> >> I then check to see if this newly generated number exists in the
> Dictionary.
> >> If it DOES NOT, then I proceed to check the rest of the adjacent pairs
> in the string.
> >> If at any time a matching string is found:
> >> Â I add the string with the transposed digits to a column in the row
> with the record being processed.
> >> I then retrieve the matched record from the dictionary. The
> Dictionary has the row number of the record.
> >> So in the row of the matched record, I put THIS record.
> >>
> >> Now... it's possible if someone transposes a number once, they could do
> it again.
> >> So, instead of overwriting the flag cell, I "concatenate" the contents
> with the current record number (using a ", " between records in the string)
> >>
> >> Now, before I check the record number, I set an AddFlag variable to
> "true".
> >> If the character swapping procedure encounters a match, this flag is
> set to "false".
> >>
> >> If, after checking the transposition of all characters, no matching
> record is found, then this new record is added to the Dictionary and move
> onto the next record.
> >>
> >> -----------------------------
> >> hope this helps
> >>
> >>
> >>
> >>
> >>
> >> Paul
> >> -----------------------------------------
> >> “Do all the good you can,
> >> By all the means you can,
> >> In all the ways you can,
> >> In all the places you can,
> >> At all the times you can,
> >> To all the people you can,
> >> As long as ever you can.†- John Wesley
> >> -----------------------------------------
> >>
> >>
> >> >________________________________
> >> >From: noskosteve <noskosteve@...>
> >> >To: ExcelVBA@yahoogroups.com
> >> >Sent: Tuesday, July 16, 2013 12:25 PM
> >> >Subject: [ExcelVBA] Re: Transposition errors
> >> >
> >> >Â
> >> >
> >> >
> >> >Sorry Paul, but with all the clutter added by the paste and the code
> itself, I can't pull out the essence of your solution. For my info, could
> you vive a verbal summary of the basis of the algo...?
> >> >--
> >> >Steve
> >> >--- Paul Schreiner <schreiner_paul@> wrote:
> >> >>
> >> >> Steve N.
> >> >> Â
> >> >> You're right, I didn't provide for a duplicated number.
> >> >> But that's easy enough to fix.
> >> >> when checking to see if the Dictionary Item exists:
> >> >> If (Not Dict_Docs.exists(ItemNo)) then
> >> >> Â
> >> >> simply add an "else" section:
> >> >> Â
> >> >> Else
> >> >> Â Â Â If (Sheets(sht).Cells(R, "B").Value & "X" = "X") Then
> >> >> Â Â Â Â Â Â Â Sheets(sht).Cells(R, "B").Value = ItemNo2
> >> >> Â Â Â Else
> >> >> Â Â Â Â Â Â Â Sheets(sht).Cells(R, "B").Value = _
> >> >> Â Â Â Â Â Â Â Sheets(sht).Cells(R, "B").Value & "," & ItemNo2
> >> >> Â Â Â End If
> >> >> Â Â Â If (Sheets(sht).Cells(Dict_Docs.Item(ItemNo2), "B").Value &
> "X" = "X") Then
> >> >> Â Â Â Â Â Â Â Sheets(sht).Cells(Dict_Docs.Item(ItemNo2), "B").Value
> = ItemNo
> >> >> Â Â Â Else
> >> >> Â Â Â Â Â Â Â Sheets(sht).Cells(Dict_Docs.Item(ItemNo2), "B").Value
> = _
> >> >> Â Â Â Â Â Â Â Sheets(sht).Cells(Dict_Docs.Item(ItemNo2), "B").Value
> & "," & ItemNo
> >> >> Â Â Â End If
> >> >> End IF
> >> >>
> >> >> As for speed...
> >> >> I tested with 10,000 records and it executed in less than two
> seconds.
> >> >> Spending a lot of time to reduce it to one second seems a bit...
> obsessive? ;)
> >> >>
> >> >>
> >> >> thanks for the review tough...
> >> >> Â
> >> >> Paul
> >> >> -----------------------------------------
> >> >> “Do all the good you can,
> >> >> By all the means you can,
> >> >> In all the ways you can,
> >> >> In all the places you can,
> >> >> At all the times you can,
> >> >> To all the people you can,
> >> >> As long as ever you can.†- John Wesley
> >> >> -----------------------------------------
> >> >>
> >> >>
> >> >> >________________________________
> >> >> >From: noskosteve <noskosteve@>
> >> >> >To: mailto:ExcelVBA%40yahoogroups.com
> >> >> >Sent: Monday, July 15, 2013 10:32 PM
> >> >> >Subject: [ExcelVBA] Re: Transposition errors
> >> >> >
> >> >> >Â
> >> >> >
> >> >> >Hmmm. This is indeed a challenge.
> >> >> >
> >> >> >Paul S's method appears to provide for lots of compares - 6 for
> every two entries as well as the transposition code cost. How about looking
> character-by-character, then counting the number of unequal characters.
> Only when you get two do you check for a transposition and the rest of the
> string. You stop comparing that pair when you have the third unequal
> character....
> >> >> >
> >> >> >While your method has VBA doing the full string compares and this
> is a Macro loop, yours compares each pair 6 times, guaranteed, plus 6
> transpositions.
> >> >> >
> >> >> >This method does the majority compares of two entries only up to
> three characters. This would be skewed if many of the numbers were assigned
> in sequence. If that was the case, you can do the comparason from the
> right. This would significanty shorten how far you went "into" a majority
> of comparisons.
> >> >> >Perhaps this it too concerned with speed...?
> >> >> >
> >> >> >......
> >> >> >Thoughts and things to explore:
> >> >> >If these numbers are assigned by another department/function and
> simply entered by operators, it might be possible to request that a list of
> valid numbers be provided along with the data.
> >> >> >
> >> >> >......
> >> >> >With the stated constraints, there are several known opportunities
> for false positives and false negatives in the test, i.e. ...
> >> >> >
> >> >> >Saad says:
> >> >> >
> >> >> >A list comprises hundreds of entries.
> >> >> >
> >> >> >"The list usually contains duplicate values" ... The assumption
> being that at least one is error-free.
> >> >> >and
> >> >> >"...there is no control list which I can compare my data with. "
> ... Internal comparison 'appears' necessary.
> >> >> >and
> >> >> >"We have 2-3 instances where this problem arises" ... It is
> relatively rare.
> >> >> >
> >> >> >Implications:
> >> >> >A transposition error is relatively rare; 2-3 out of hundreds.
> >> >> >
> >> >> >Doing this 'internal comparison' depends on a number appearing at
> least twice in the list to catch the error.
> >> >> >However, If an operator does the same transposition twice (seems
> somewhat likely), this method won't catch it. It will appear to be a valid
> number, though it isn't - false negative.
> >> >> >
> >> >> >If the number is in the list only once, it can never be flagged as
> a possible transposition - possible false negative.
> >> >> >
> >> >> >If two *valid* numbers are possible that happen to have this type
> of transposition, they may, or may not be flagged (depending on how many
> times they appear in the list). For example, if the date is in the P.O.
> number, a P.O. done on the 12th and another done on the 21st would be
> flagged as a transposition *unless* they show up more than once each.
> Potential false positive or false negative.
> >> >> >
> >> >> >Just some thoughts.
> >> >> >--
> >> >> >Steve N.
> >> >> >
> >> >> >--- "paulschreinerindy" wrote:
> >> >> >>
> >> >> >> Saad,
> >> >> >>
> >> >> >> What I would do (and have done in the past) is utilize a
> Dictionary object.
> >> >> >> I would insert the first item into the Dictionary.
> >> >> >> Then, read the second item and see if it exists in the dictionary.
> >> >> >> If it does not, then:
> >> >> >> transpose each of the adjacent pairs of characters and look in
> the dictionary for the resulting string.
> >> >> >> If the string does not exist, add the original string to the
> dictionary and proceed.
> >> >> >>
> >> >> >> Keep in mind that this will ONLY check the transposition of two
> adjacent characters!
> >> >> >>
> >> >> >> I put the following macro together.
> >> >> >> I created a list of 10,000 7-digit "random" numbers, and the
> macro ran in about 2 seconds in Excel 2010.
> >> >> >>
> >> >> >> It ASSUMES that your numbers are in column "A".
> >> >> >> It puts the "original" value in column "B" for any suspected
> duplicates.
> >> >> >> ------------------------------------------------
> >> >> >>
> >> >> >> Option Explicit
> >> >> >>
> >> >> >> Sub Find_Transpose()
> >> >> >> Dim Dict_Docs
> >> >> >> Dim nRows As Integer, R As Integer, inx As Integer, sht As String
> >> >> >> Dim ItemNo As String
> >> >> >> Dim ItemNo2
> >> >> >> Dim AddFlag As Boolean
> >> >> >>
> >> >> >>
> >> >> >> ThisWorkbook.Activate
> >> >> >> sht = "Data"
> >> >> >> Set Dict_Docs = CreateObject("Scripting.Dictionary")
> >> >> >> Dict_Docs.RemoveAll
> >> >> >> nRows =
> Application.WorksheetFunction.CountA(Sheets(sht).Range("A1:A65000"))
> >> >> >> Sheets(sht).Range("B2:B" & nRows).ClearContents
> >> >> >> 'Add first item to Dictionary
> >> >> >> Dict_Docs.Add Sheets(sht).Cells(2, "A").Value, 2
> >> >> >> ' Loop Through remaining items
> >> >> >> For R = 3 To nRows
> >> >> >> ItemNo = Sheets(sht).Cells(R, "A").Value
> >> >> >> AddFlag = True
> >> >> >> If (Not Dict_Docs.exists(ItemNo)) Then
> >> >> >> If (Len(ItemNo) > 2) Then
> >> >> >> For inx = 2 To Len(ItemNo)
> >> >> >> If (Mid(ItemNo, inx, 1) <> Mid(ItemNo, inx - 1, 1)) Then
> >> >> >> If (inx = 2) Then
> >> >> >> ItemNo2 = ""
> >> >> >> Else
> >> >> >> ItemNo2 = Left(ItemNo, inx - 2)
> >> >> >> End If
> >> >> >> ItemNo2 = ItemNo2 & Mid(ItemNo, inx, 1)
> >> >> >> ItemNo2 = ItemNo2 & Mid(ItemNo, inx - 1, 1)
> >> >> >> ItemNo2 = ItemNo2 & Mid(ItemNo, inx + 1, Len(ItemNo) - inx)
> >> >> >> If (Dict_Docs.exists(ItemNo2)) Then
> >> >> >> If (Sheets(sht).Cells(R, "B").Value & "X" = "X") Then
> >> >> >> Sheets(sht).Cells(R, "B").Value = ItemNo2
> >> >> >> Else
> >> >> >> Sheets(sht).Cells(R, "B").Value = Sheets(sht).Cells(R, "B").Value
> & "," & ItemNo2
> >> >> >> End If
> >> >> >> If (Sheets(sht).Cells(Dict_Docs.Item(ItemNo2), "B").Value & "X" =
> "X") Then
> >> >> >> Sheets(sht).Cells(Dict_Docs.Item(ItemNo2), "B").Value = ItemNo
> >> >> >> Else
> >> >> >> Sheets(sht).Cells(Dict_Docs.Item(ItemNo2), "B").Value =
> Sheets(sht).Cells(Dict_Docs.Item(ItemNo2), "B").Value & "," & ItemNo
> >> >> >> End If
> >> >> >> AddFlag = False
> >> >> >> Exit For
> >> >> >> End If
> >> >> >> End If
> >> >> >> Next inx
> >> >> >> If (AddFlag) Then Dict_Docs.Add ItemNo, R
> >> >> >> End If
> >> >> >> End If
> >> >> >> Next R
> >> >> >> MsgBox "Finished"
> >> >> >> End Sub
> >> >> >>
> >> >> >> --------------------------------------------------
> >> >> >> Paul
> >> >> >>
> >> >> >> -----------------------------------------
> >> >> >> "Do all the good you can,
> >> >> >> By all the means you can,
> >> >> >> In all the ways you can,
> >> >> >> In all the places you can,
> >> >> >> At all the times you can,
> >> >> >> To all the people you can,
> >> >> >> As long as ever you can." - John Wesley
> >> >> >> -----------------------------------------
> >> >> >>
> >> >> >>
> >> >> >> --- In mailto:ExcelVBA%40yahoogroups.com, Saad Usman
> <saad.saadusman@> wrote:
> >> >> >> >
> >> >> >> >
> >> >> >> > > Dear Excel Wizards:
> >> >> >> > >
> >> >> >> > > I have tried to google my issue, but have not been able to
> find a solution so far.
> >> >> >> > >
> >> >> >> > >
> >> >> >> > >
> >> >> >> > > The problem is that I receive data in the form of a list of
> hundreds of records where entries in one particular column (say part
> number) comprise of 7 digits. The list usually contains duplicate values,
> which I have been able to identify using conditional formatting. However
> some times, the data entry operator makes transposition errors (for e.g.
> feeding 1243567 instead of 1234567). I need to identify those cells where
> at least 5 digits are identical so that I can manually decide whether there
> is an error in data entry.
> >> >> >> > >
> >> >> >> > > Any help via formulas/macro would be greatly welcomed.
> >> >> >> > >
> >> >> >> > >
> >> >> >> > Regards, Saad
> >> >> >> >
> >> >> >> > [Non-text portions of this message have been removed]
> >> >> >> >
> >> >> >>
> >> >> >
> >> >> >
> >> >> >
> >> >>
> >> >> [Non-text portions of this message have been removed]
> >> >>
> >> >
> >> >
> >> >
> >>
> >> [Non-text portions of this message have been removed]
> >>
> >
> >
> >
> >
> >
>
> [Non-text portions of this message have been removed]
>
>
>


[Non-text portions of this message have been removed]



------------------------------------

----------------------------------
Be sure to check out TechTrax Ezine for many, free Excel VBA articles! Go here: http://www.mousetrax.com/techtrax to enter the ezine, then search the ARCHIVES for EXCEL VBA.

----------------------------------
Visit our ExcelVBA group home page for more info and support files:
http://groups.yahoo.com/group/ExcelVBA

----------------------------------
More free tutorials and resources available at:
http://www.mousetrax.com

----------------------------------Yahoo! Groups Links

<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/ExcelVBA/

<*> Your email settings:
Individual Email | Traditional

<*> To change settings online go to:
http://groups.yahoo.com/group/ExcelVBA/join
(Yahoo! ID required)

<*> To change settings via email:
ExcelVBA-digest@yahoogroups.com
ExcelVBA-fullfeatured@yahoogroups.com

<*> To unsubscribe from this group, send an email to:
ExcelVBA-unsubscribe@yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
http://docs.yahoo.com/info/terms/

Tidak ada komentar:

Posting Komentar