Bulk text replacement in Power Query with the List.Accumulate function

We have already worked out how to quickly bulk-replace text against a reference list using formulas. Now let’s try to do the same in Power Query.

As often happens, performing this task is much easier than explaining why it works, but let’s try to do both 🙂

So, we have two “smart” dynamic tables, created from ordinary ranges with the keyboard shortcut Ctrl+T or the command Home - Format as Table.

I named the first table Data and the second one Directory, using the Table Name field on the Design tab.

Task: in the addresses in the Data table, replace every occurrence of a value from the Find column of the Directory table with its correct counterpart from the Replace column. The rest of the text in the cells should remain untouched.

Step 1. Load the directory into Power Query and turn it into a list

Place the active cell anywhere in the reference table, then click the From Table/Range button on the Data tab (or on the Power Query tab, if you have an older version of Excel and installed Power Query as a separate add-in).

The reference table will be loaded into the Power Query editor.

So that it does not get in the way, the automatically added Changed Type step in the Applied Steps panel on the right can be safely deleted, leaving only the Source step.

Now, to perform further transformations and replacements, we need to turn this table into a list.

A lyrical digression

Before continuing, let’s first understand the terms. Power Query can work with several types of objects:
  • Table – a two-dimensional array consisting of several rows and columns.
  • Record – a one-dimensional array (a row) consisting of several named fields, for example [Name = "Masha", Gender = "f", Age = 25]
  • List – a one-dimensional array (a column) consisting of several elements, for example {1, 2, 3, 10, 42} or {"Faith", "Hope", "Love"}

To solve our problem, we will be primarily interested in the List type.
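As a minimal sketch of how these three object types look in M code (the second person and all step names are made up for illustration), the following query builds one of each and reads values back from them:

```powerquery
let
    // A list: an ordered sequence of values in curly brackets
    Numbers = {1, 2, 3, 10, 42},
    // A record: named fields in square brackets
    Person = [Name = "Masha", Gender = "f", Age = 25],
    // A table built from a list of records (hypothetical sample data)
    People = Table.FromRecords({Person, [Name = "Ivan", Gender = "m", Age = 30]}),
    // Item access uses {index} (zero-based); field access uses [FieldName]
    Result = [
        ThirdNumber = Numbers{2},        // 3
        PersonName = Person[Name],       // "Masha"
        SecondRowAge = People{1}[Age]    // 30
    ]
in
    Result
```

Note that a table row is itself a record, which is why People{1}[Age] works: first pick the row, then the field.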

The trick here is that list items in Power Query can be not only plain numbers or text, but also other lists or records. It is exactly such a list of records that we need to turn our directory into. In Power Query notation (records in square brackets, lists in curly brackets) this would look like:

{
    [ Find = "S-Pb", Replace = "St. Petersburg" ],
    [ Find = "SPb", Replace = "St. Petersburg" ],
    [ Find = "Peter", Replace = "St. Petersburg" ],
    etc.
}

This transformation is performed by a special function of the M language built into Power Query: Table.ToRecords. To apply it, wrap the code of the Source step in this function directly in the formula bar.

In other words, the expression that was already in the Source step becomes the argument of Table.ToRecords.
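The original screenshots of the formula bar are not available, so here is a hedged reconstruction. It assumes the Source step contains the expression Excel typically generates for a table named Directory; your actual step text may differ slightly.

```powerquery
// Before: the auto-generated Source step (assumed form)
= Excel.CurrentWorkbook(){[Name="Directory"]}[Content]

// After: the same expression wrapped in Table.ToRecords
= Table.ToRecords(Excel.CurrentWorkbook(){[Name="Directory"]}[Content])
```

Each of these is a single formula-bar expression, not a complete query; the comments are only labels for the two states.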

After adding the Table.ToRecords function, the appearance of our table changes: it turns into a list of records. The contents of an individual record can be seen at the bottom of the preview pane by clicking the cell background next to any Record link (but not on the word itself!).

In addition, it makes sense to add one more touch: caching (buffering) the list we have created. This forces Power Query to load our lookup list into memory once and not recalculate it each time we later access it to perform a replacement. To do this, wrap our formula in one more function, List.Buffer.
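Put together, the whole Directory query might look like the sketch below when opened in the Advanced Editor. The Excel.CurrentWorkbook expression is again an assumption about what Excel generated for the table named Directory.

```powerquery
let
    // Table -> list of records -> buffered in memory once
    Source = List.Buffer(
        Table.ToRecords(
            Excel.CurrentWorkbook(){[Name = "Directory"]}[Content]
        )
    )
in
    Source
```

Because the query returns a list rather than a table, it is later referenced simply by its name, Directory, as the first argument of List.Accumulate.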

Such caching gives a very noticeable speed-up (several times over!) when there is a large amount of source data to be cleaned.

This completes the preparation of the directory.

It remains to click Home - Close & Load - Close & Load To..., choose the Only Create Connection option, and return to Excel.

Step 2. Loading the data table

Everything is straightforward here. As before with the directory, place the cursor anywhere in the table and click the From Table/Range button on the Data tab, and our Data table gets into Power Query. The automatically added Changed Type step can be removed here as well.

No special preparation is needed for it, so we move on to the most important part.

Step 3. Perform replacements using the List.Accumulate function

Let’s add a calculated column to our data table using the command Add Column - Custom Column. In the window that opens, enter the name of the new column (for example, Corrected address) and, as its formula, our magic function List.Accumulate.
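For reference, here is a sketch of what the resulting Data query could look like in the Advanced Editor. The formula inside List.Accumulate is the one discussed later in this article; the Source expression, the column name Address, and the step name AddedCustom are assumptions made for illustration.

```powerquery
let
    Source = Excel.CurrentWorkbook(){[Name = "Data"]}[Content],
    // For each row, start from the row's [Address] text and apply
    // every Find/Replace pair from the Directory query in turn
    AddedCustom = Table.AddColumn(
        Source,
        "Corrected address",
        each List.Accumulate(
            Directory,
            [Address],
            (state, current) => Text.Replace(state, current[Find], current[Replace])
        )
    )
in
    AddedCustom
```

In the Custom Column dialog itself you type only the List.Accumulate(...) part; Power Query adds the surrounding Table.AddColumn and each.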

It remains to click OK, and we get a column with the replacements made.

Note that:

  • Since Power Query is case sensitive, no replacement happened in the penultimate row, because the directory contains “SPb”, not “SPB”.
  • If the source text contains several substrings to replace at once (for example, in row 7 both “S-Pb” and “Prospectus” need replacing), this causes no problems (unlike replacing with formulas from the previous method).
  • If there is nothing to replace in the source text (row 9), no errors occur (again, unlike replacement with formulas).
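These three behaviours can be checked in isolation with a small self-contained query. This is only a sketch: the directory entries, the sample addresses, and the helper name Fix are all hypothetical.

```powerquery
let
    // Hypothetical directory entries, for illustration only
    Directory = {
        [Find = "S-Pb", Replace = "St. Petersburg"],
        [Find = "SPb", Replace = "St. Petersburg"],
        [Find = "pr.", Replace = "prospekt"]
    },
    // Helper that applies every directory entry to one address
    Fix = (address as text) as text =>
        List.Accumulate(
            Directory,
            address,
            (state, current) => Text.Replace(state, current[Find], current[Replace])
        ),
    Result = [
        CaseMismatch = Fix("SPB, Nevsky 1"),           // unchanged: "SPB" does not match "SPb"
        TwoReplacements = Fix("S-Pb, Nevsky pr. 1"),   // both substrings are replaced
        NothingToReplace = Fix("Moscow, Tverskaya 7")  // returned as is, no error
    ]
in
    Result
```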

The performance of such a query is very decent. For example, for a source table of 5000 rows, this query refreshed in less than a second (without buffering, by the way, it took about 3 seconds!).

How the List.Accumulate function works

In principle, this article could end here (for me writing it, and for you reading it). But if you want not only to be able to do it, but also to understand how it works “under the hood”, you will have to dive a little deeper into the rabbit hole and get to grips with the List.Accumulate function, which did all the bulk replacement work for us.

The syntax of this function is:

=List.Accumulate(list, seed, accumulator)

where

  • list is the list whose elements we iterate over
  • seed – the initial state
  • accumulator – a function that performs some operation (mathematical, text, etc.) on the next element of the list and accumulates the result of processing in a special variable

In general, the syntax for writing functions in Power Query looks like this:

(argument1, argument2, … argumentN) => some actions with arguments

For example, the summation function could be represented as:

(a, b) => a + b
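As a small sketch of how such a function is defined and called inside a query (the names Sum and TypedSum are made up, and the type annotations in the second variant are optional extras not used elsewhere in this article):

```powerquery
let
    // A function value with two parameters, bound to a name
    Sum = (a, b) => a + b,
    // The same function with optional type annotations
    TypedSum = (a as number, b as number) as number => a + b,
    Result = {Sum(2, 3), TypedSum(10, 32)}   // {5, 42}
in
    Result
```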

For List.Accumulate, this accumulator function has two required arguments. They can be named anything, but the usual names are state and current, as in the official help for this function, where:

  • state – a variable where the result is accumulated (its initial value is the seed mentioned above)
  • current – the next iterated value from the list

For example, let’s walk through the logic of the following construction step by step:

=List.Accumulate({3, 2, 5}, 10, (state, current) => state + current)

  1. The value of the variable state is set equal to the seed argument, i.e. state = 10.
  2. We take the first element of the list (current = 3) and add it to the variable state (10). We get state = 13.
  3. We take the second element of the list (current = 2) and add it to the current accumulated value in the variable state (13). We get state = 15.
  4. We take the third element of the list (current = 5) and add it to the current accumulated value in the variable state (15). We get state = 20.

This last accumulated value of state is what our List.Accumulate function returns as its result.
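The step-by-step trace above can be reproduced in Power Query itself. The sketch below is a variation on the construction, not the original formula: instead of a single number, the state is a list that keeps every intermediate total.

```powerquery
let
    // Same list and seed as above, but the state collects all intermediate totals
    Trace = List.Accumulate(
        {3, 2, 5},
        {10},
        (state, current) => state & {List.Last(state) + current}
    )
in
    Trace   // {10, 13, 15, 20}
```

The last item of the resulting list, 20, is the value the plain version returns.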

With a little imagination, List.Accumulate can be used to imitate, for example, Excel’s CONCATENATE function (its analogue in Power Query is called Text.Combine).
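The original expression was shown only as a screenshot, so the following is a sketch of one way to write it, with made-up sample words. The seed is an empty text, and each step appends the next item with the & operator.

```powerquery
let
    Words = {"Power", " ", "Query"},
    // Seed is an empty text; each step appends the next item
    Joined = List.Accumulate(Words, "", (state, current) => state & current),
    // Built-in equivalent
    Combined = Text.Combine(Words)
in
    [Joined = Joined, Combined = Combined]   // both fields are "Power Query"
```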

Or even to find the maximum value (imitating Excel’s MAX function, which in Power Query is called List.Max).
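Again, the original expression is not available, so this is only one possible sketch. It assumes a non-empty list, seeds the state with the first item, and keeps the larger of state and current at every step.

```powerquery
let
    Numbers = {3, 2, 5},
    // Seed with the first item, then keep the larger of state and current
    Largest = List.Accumulate(
        List.Skip(Numbers, 1),
        Numbers{0},
        (state, current) => if current > state then current else state
    ),
    // Built-in equivalent
    BuiltIn = List.Max(Numbers)
in
    [Largest = Largest, BuiltIn = BuiltIn]   // both fields are 5
```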

However, the main strength of List.Accumulate is that it can process not only simple text or numeric lists as arguments, but also more complex objects, for example lists of lists or lists of records (hello, Directory!)

Let’s look again at the construction that performed the replacement in our problem:

List.Accumulate(Directory, [Address], (state, current) => Text.Replace(state, current[Find], current[Replace]))

What is really going on here?

  1. As the initial value (seed) we take the first messy text from the [Address] column of our table: 199034, St. Petersburg, str. Beringa, d. 1
  2. Then List.Accumulate iterates over the elements of the Directory list one by one. Each element of this list is a record consisting of a pair of fields “what to find, what to replace it with” or, in other words, the next row of the directory.
  3. The accumulator function places the initial value (the first address, 199034, St. Petersburg, str. Beringa, d. 1) into the variable state and applies the replacement operation to it using the standard M function Text.Replace (analogous to Excel’s SUBSTITUTE function). Its syntax is:

    Text.Replace( original text, what we are looking for, what we are replacing with )

    and here we have:

    • state is our dirty address (which got there from seed)
    • current[Find] – the value of the Find field from the next iterated record of the Directory list, which is held in the variable current
    • current[Replace] – the value of the Replace field from the next iterated record of the Directory list, held in current

Thus, for each address, a full pass over all the rows of the directory is run, each time replacing the text from the [Find] field with the value from the [Replace] field.
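To watch this pass happen for a single address, the state can again be turned into a list of intermediate results. This is a sketch with hypothetical data: a two-row directory and a made-up raw address, not the tables from the workbook.

```powerquery
let
    // Hypothetical directory and address, for illustration only
    Directory = {
        [Find = "S-Pb", Replace = "St. Petersburg"],
        [Find = "str.", Replace = "street"]
    },
    Address = "199034, S-Pb, str. Beringa, d. 1",
    // Collect the text as it looks after each directory record has been applied
    Steps = List.Accumulate(
        Directory,
        {Address},
        (state, current) =>
            state & {Text.Replace(List.Last(state), current[Find], current[Replace])}
    )
in
    Steps
```

The result is a list of three texts: the raw address, the address after the first record has been applied, and the address after the second one. The formula used in the article keeps only the last of these.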

Hope you got the idea 🙂

  • Bulk replace text in a list using formulas
  • Regular expressions (RegExp) in Power Query
