More on Arrays

In the previous section, we discussed the basics of using arrays in Julia. In this section, we shall briefly look at a few ways to make working with arrays a little easier.

More on Indexing

When we discussed the basics of using arrays, indexing was done on a single entry basis, e.g.

a = [i^2 for i = 1:10]
# 10-element Vector{Int64}:
#    1
#    4
#    9
#   16
#   25
#   36
#   49
#   64
#   81
#  100

a[3]
# 9

You can however also specify an array of indices to return several values at once:

a[4:6]
# 3-element Vector{Int64}:
#  16
#  25
#  36

This opens up more sophisticated selection strategies. Let’s just select the values between 10 and 30:

a[10 .<= a .<= 30]
# 2-element Vector{Int64}:
#  16
#  25

Here 10 .<= a .<= 30 returns a BitVector indicating the indices where a holds values between 10 and 30, and we simply use that to index into a to get the values.

10 .<= a .<= 30
# 10-element BitVector:
#  0
#  0
#  0
#  1
#  1
#  0
#  0
#  0
#  0
#  0

Note

Note the use of the period before the comparison operators: .<=. This broadcasts the operator, i.e. applies it element-wise to the array.

We can combine several tests of the data into a single selection function (called a predicate¹ function). For example, let’s say we wanted to select only even values, larger than 30. The can use the findall function to return a array of indices that meet our requirements:

# returns true when all the tests are passed
function mytest(x)
    return iseven(x) && (x > 30) 
end

idx = findall(mytest, a)
# 3-element Vector{Int64}:
#   6
#   8
#  10

a[idx]
# 3-element Vector{Int64}:
#   36
#   64
#  100

In addition to findall, there is also findfirst and findlast, whose use should be obvious from their names.

To find the smallest or largest entries, use findmax and findmin. These take a function, f(x) and return the index in a where f(x) is maximal (or minimal), as well as the value of f(x). To just find the largest/smallest entry, you could just pass identity as the function, although minimum and maximum may be more efficient as they won’t try to apply a function first. If you need both the smallest and largest values, use extrema.

a = rand(5)
# 5-element Vector{Float64}:
#  0.6543374123600577 
#  0.39436407973246723
#  0.6426654745232094 
#  0.7604450070679262 
#  0.35561125088056034

findmax(identity, a)
# (0.7604450070679262, 4)

minimum(a)
# 0.35561125088056034

maximum(a)
# 0.7604450070679262

extrema(a)
# (0.35561125088056034, 0.7604450070679262)

A closely related option is argmax. This will return the value in a where the function is maximised.

findmax(abs2, a)
# (0.5782766087745382, 4)

argmax(abs2, a)
# 0.7604450070679262

Here abs2 is a standard Julia function that returns the square of the absolute value (so just the square for real inputs). findmax returns the maximum value of the function (0.57827…), and the index of the entry in a that maximised the function (4). argmax returns the value of the entry in a that maximised the function (0.76044…).

There is of course also findmin and argmin.

You can also check if any or all of the entries in an array return true for your predicate function, using any and all.

Multidimensional arrays can also be index with a one-dimensional index:

H = [1//(i + j -1) for i = 1:5, j = 1:5]
# 5×5 Matrix{Rational{Int64}}:
#  1//1  1//2  1//3  1//4  1//5
#  1//2  1//3  1//4  1//5  1//6
#  1//3  1//4  1//5  1//6  1//7
#  1//4  1//5  1//6  1//7  1//8
#  1//5  1//6  1//7  1//8  1//9


H[2]
# 1//2

H[5]
# 1//5

The index refers to the position in memory. Julia uses column-major ordering, so the array entries are arranged in memory in such a way that the first column’s entries are sequential, followed by the second column’s entries and so on.

This means the fastest way to cycle through all the entries in a matrix is one column at a time. Let’s see how much of a difference this makes:

N = 10_000
10000

julia> A = rand(N, N)
10000×10000 Matrix{Float64}:
 0.125774   0.99546    0.818036   0.336722    0.0801124  …  0.424328   0.211697    0.384141   0.991287   0.803988
 0.352014   0.275881   0.283768   0.255942    0.697316      0.338242   0.309393    0.0545593  0.951967   0.801345
 0.0798538  0.370381   0.18909    0.045718    0.40401       0.0178459  0.45153     0.662736   0.229347   0.978208
 0.20596    0.374736   0.697504   0.149587    0.557356      0.29373    0.0687817   0.743411   0.710738   0.977884
 0.272887   0.539657   0.293967   0.563295    0.48734       0.591128   0.73146     0.16616    0.0643573  0.37605
 0.856904   0.697862   0.431296   0.614269    0.44666    …  0.155158   0.0712897   0.833298   0.489254   0.451948
 0.180036   0.627582   0.939213   0.683959    0.130053      0.876024   0.168418    0.252402   0.100271   0.29253
 0.977263   0.824657   0.485721   0.902796    0.657551      0.381646   0.284706    0.954012   0.277599   0.410895
 0.273142   0.505615   0.190199   0.628423    0.345799      0.708315   0.0279852   0.863847   0.0679989  0.982656
 0.0444545  0.507261   0.472975   0.936737    0.460258      0.151091   0.186248    0.256901   0.888023   0.43243
 0.383873   0.344713   0.659127   0.67701     0.473101   …  0.265238   0.677027    0.459148   0.175624   0.0108319
 0.0978964  0.949814   0.121223   0.483722    0.303262      0.72285    0.826869    0.880193   0.0725074  0.397457
 0.837655   0.791215   0.301445   0.757322    0.725472      0.728449   0.247278    0.492649   0.757416   0.0875311
 ⋮                                                       ⋱  ⋮
 0.0118103  0.823731   0.363951   0.00415639  0.38555       0.0709323  0.910507    0.648764   0.716338   0.19358
 0.602335   0.840785   0.151156   0.645298    0.643621      0.104928   0.925803    0.116246   0.0470635  0.118243
 0.442508   0.0942871  0.687552   0.878454    0.765566   …  0.330269   0.976848    0.657189   0.622884   0.527785
 0.036136   0.934861   0.0186539  0.767235    0.8053        0.0876643  0.476363    0.0331153  0.973213   0.340101
 0.698365   0.928986   0.899297   0.37211     0.641652      0.773955   0.00228385  0.156724   0.920929   0.438071
 0.942956   0.525783   0.733806   0.00928839  0.32164       0.585651   0.930962    0.709463   0.959226   0.641829
 0.140685   0.315704   0.467777   0.456009    0.226625      0.0200179  0.969894    0.707265   0.0822367  0.899115
 0.280121   0.979617   0.212137   0.456986    0.958369   …  0.847566   0.520314    0.449312   0.6345     0.0621001
 0.552303   0.918199   0.969841   0.0787598   0.631372      0.866611   0.311826    0.56646    0.65133    0.206941
 0.829611   0.429891   0.964987   0.519071    0.924919      0.550499   0.266589    0.486168   0.982127   0.55406
 0.0242417  0.0808636  0.317017   0.802046    0.815374      0.944076   0.0478049   0.100928   0.573823   0.331286
 0.138703   0.308092   0.889499   0.257449    0.975605      0.826279   0.688673    0.894047   0.416685   0.23544

function sum1(A)
# Row-wise addition
    S = 0.0
    n, m = size(A)
    for i = 1:n, j=1:m
        S += A[i, j]
    end

    return S
end
# sum1 (generic function with 1 method)

function sum2(A)
# Column-wise addition
    S = 0.0
    n, m = size(A)
    for j = 1:n, i=1:m
        S += A[i, j]
    end

    return S
end
# sum2 (generic function with 1 method)

function sum3(A)
# Allow Julia to specify best indexing
    S = 0.0
    for i in eachindex(A)
        S += A[i]
    end

    return S
end
# sum3 (generic function with 1 method)

using BenchmarkTools

@btime sum1($A)
#   1.449 s (0 allocations: 0 bytes)
# 4.999981456219146e7

@btime sum2($A)
#   91.894 ms (0 allocations: 0 bytes)
# 4.999981456220286e7

@btime sum3($A)
#   92.086 ms (0 allocations: 0 bytes)
# 4.999981456220286e7

Here we define three functions for summing up the values in the matrix A.

sum1 adds up the values in A one row at a time. sum2 adds the values one column at a time and sum3 asks Julia to supply the best order with the eachindex() iterator.

So, by picking the right order for our indices, we calculate the sum about 15X faster! The difference will be larger with larger arrays.

Note

It is never a good idea to benchmark code in global scope, as the compiler can do no optimisation in that case. This is because the types of any variables can change at any point. It is always better to put your code in a function. Julia has a built-in @time macro, but we use @btime from BenchmarkTools here. It is a more comprehensive, and more accurate benchmarking toolset. In addition, we pass $A instead of just A so the contents of the matrix is sent to the function (effectively as a constant), rather than a pointer to the global variable. This allows more accurate benchmarking.

To see the loss of speed in global scope, we run a simple timer using the best option from above, but not wrapped in a function:

S = 0.0
# 0.0

@time for i = eachindex(A)
           S += A[i]
       end
#  39.595608 seconds (500.00 M allocations: 8.941 GiB, 2.19% gc time)

Note the huge increase in both time and memory use. The exact extent of performance loss will vary depending on the state of you computer, but it is rarely negligible.

The eachindex iterator only gives you the values in the array. If you also need an index, you can use enumerate instead:

A = collect(1:2:50)
# 25-element Vector{Int64}:
#   1
#   3
#   5
#   7
#   9
#  11
#  13
#   ⋮
#  37
#  39
#  41
#  43
#  45
#  47
#  49

for (i, val) in enumerate(A)
           if 10 <= val <= 20
               println("Entry $i is equal to $val")
           end
       end
# Entry 6 is equal to 11
# Entry 7 is equal to 13
# Entry 8 is equal to 15
# Entry 9 is equal to 17
# Entry 10 is equal to 19

Pre-allocation and making the most of mutating functions

Consider the function below:

function mycalc(a, b, c, start, step, stop)
    buffer = zeros(length(start:step:stop))

    for (i, x) in enumerate(start:step:stop)
        buffer[i] = a + b * x + c * x^2
    end
        
    return sum(buffer)
end

 @btime mycalc(1, 2, 3, 0.0, 0.1, 100000.0)
#   3.497 ms (2 allocations: 7.63 MiB)
# 1.0000115001105e16

This function allocates memory to hold the array (buffer), every time it is called. Here it is a smallish array, but in more complicated calculations, or when you call this function many times, these allocations add up quickly.

We can avoid this repeated allocation by creating the array once, outside the function, and then passing it to the function. The function will then modify this block of memory - mutate in the jargon. We call this type of function a mutating function, and in Julia convention, indicate this to the user by adding an exclamation mark to the name².

buffer = zeros(1000001)

function mycalc!(buffer, a, b, c, start, step, stop)
    for (i, x) in enumerate(start:step:stop)
        buffer[i] = a + b * x + c * x^2
    end
    return sum(buffer)
end



@btime mycalc!(buffer, 1, 2, 3, 0.0, 0.1, 100000.0)
#   2.294 ms (0 allocations: 0 bytes)
# 1.0000115001105e16

As you can see, the mutating version (sometimes called the in-place version) is quite a bit faster than the allocating (out-of-place) version. Allocating memory is not a quick step and as a general rule, something to be avoided in code where speed is critical. When you are not interested in squeezing every last nanosecond out of your code, there is no reason to be particularly focused on this. In that case rather write code that is easier to read, as a gift to your future self.

And, since this is also important, note that there are usually several ways to get rid of allocations, other than pre-allocating memory:

function mycalc2(a, b, c, start, step, stop)
    s = 0.0
    for x in start:step:stop
        s += a + b*x + c*x^2
    end
    return s
end

@btime mycalc2(1, 2, 3, 0.0, 0.1, 100000.0)
#   2.224 ms (0 allocations: 0 bytes)
# 1.0000115001103872e16

Here we just didn’t store all of the intermediate calculations, so we didn’t need to allocate memory to store them in. If you don’t need it again later, don’t waste CPU time on putting it somewhere.

Application - Searching for prime numbers

The Sieve of Erastothenes is a method for finding prime numbers attributed to Erastothenes of of Cyrene, the third chief librarian at the Library of Alexandria, and dates back at least to the 2nd century. Erastothenes was also the first person to accurately culculate the Erath’s circumference.

The sieve is a very simple algorithm. You start with a list of consecutive numbers, starting with 2, up to the largest number you are interested in (say, N). Starting with the first value, remove all multiples of it. Then select the next remaining number, and remove all multiples of this number. This continues up to N/2, since there are no multiples of numbers larger than N/2 in the list. The numbers that have not been removed, are all prime numbers.

Let’s write a simple version of the sieve to find all prime numbers smaller than a specified value for N:

function primesieve(N=2_000_000)
    # Create the list, setting even numbers to false already
    primes = isodd.(1:N) # This allocates an array
    primes[1] = false # Also exclude 1
    primes[2] = true # And remember that 2 is prime!

    # Start the sieve at 3
    nextval = 3
    while nextval <= N ÷ 2 # ÷ does integer division
        if primes[nextval] # Still in the list?
            primes[2*nextval:nextval:N] .= false # Then remove all multiples
        end
        nextval += 1 # Start looking for next entry in list
        while !primes[nextval] && nextval <= N
            nextval += 1
        end
    end

    return (1:N)[primes] # This allocates an array
end

using BenchmarkTools
@btime primesieve()
#   5.970 ms (7 allocations: 1.37 MiB)

Most of the code above should be easy to follow. Perhaps one thing worth explaining is the last line of the function:

return (1:N)[primes]

This creates a UnitRange object - 1:N. This is like an array, but lazily constructed. Lazy here is a good thing. It means the array is not actually created and memory allocated. The value of interest in the iterator is simply calculated on-the-fly when it is requested. Since a UnitRange is simply a list of values with start, step size and largest value (start, step, stop) all known, the n-th value can easily be found when you index into the iterator.

Here we index with a BitArray (primes), which will return all the entries in the iterator for which the corresponding entry in primes is true. This requires Julia to allocate an array with the results.

From the benchmarking results, we see that there were 7 allocations. If we need to squeeze every last millisecond of performance from the code, we can try to remove these allocations:

function primesieve!(primes)
    N = length(primes)
    # Create the list, setting even numbers to false already
    primes[1] = false # Exclude 1
    primes[2] = true # And remember that 2 is prime!

    # Start the sieve at 3
    nextval = 3
    while nextval <= N ÷ 2 # ÷ does integer division
        if primes[nextval] # Still in the list?
            primes[2*nextval:nextval:N] .= false # Then remove all multiples
        end
        nextval += 1 # Start looking for next entry in list
        while !primes[nextval] && nextval <= N
            nextval += 1
        end
    end

    return primes
end

N = 2_000_000
primes = isodd.(1:N)
idx = primesieve!(primes)
primes = (1:N)[idx]

startval = isodd.(1:N)
@btime primesieve!($primes) setup=(primes = copy(startval))
#   5.469 ms (0 allocations: 0 bytes)

So, with a very simple change, we eliminated the allocations and the code is now ~8% faster.

Note

You will see some differences in how we benchmarked the second function. 1. We use a setup block to reinitialise the primes array for each repeat of the benchmarking 2. We interpolate the value of primes into the function call, by using $primes in the call.

The latter is especially important when passing variables to a function you are benchmarking. Since primes is a global variable, Julia cannot do full optimization of the code, as it must be able to handle any changes to primes that can cause it’s type to change. Benchmarking in the global scope is therefore not best practice. We get around this by passing the interpolated value of the variable, so Julia actually creates a constant value from the variable and passes that. This gives more accurate benchmarking results.

Sorting and searching

Sorting and searching is an extremely common activity in coding. As a result, Julia has a rich set of built-in tools for this, so you do not have to build your own.

The primary functions to call are sort (or sort! to modify in-place) and sortperm (and sortperm!), if you just want a permutation array, p, such that A[p] puts the entries of A in the correct order. This is another nice indexing trick in Julia!

The function takes several inputs:

sort!(v; alg::Algorithm=defalg(v), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)

Here, v is the vector to sort, alg can be one of InsertionSort, QuickSort, PartialQuickSort(k) or MergeSort, but you can also specify custom sorting algorithms by overloading the defalg() function for your array type.

The lt (less than) parameter specifies the method used to determine if one entry in the array is less than another. The default method is to call Julia’s isless() function, but you can specify any custom function, as long as it meets the requirements specified in the Julia documentation:

by specifies a function that is applied to the entries of v before they are sorted. The entries are sorted based on the results of this function for each entry. The default function, identity(), just returns the input value, so the array is sorted based on the values of the entries. You could however, sort values based on the return values of any function.

rev specifies whether the entries are sorted in increasing or decreasing order, based on the order specification given in order. Having both rev and order to specify the sequence in which results are ordered, is admittedly a little confusing, but is needed to allow alternative implementations for the lt function, that could include more options than Forward and Reverse.

Many other Julia functions, such as searching, can take advantage of knowledge on whether the array has been sorted beforehand. You can check this for an arbitrary array (assuming you don’t sort it yourself) with the issorted() function.

Searching can then be done with searchsorted() or searchsortedfirst(). The former returns all indices in the array where the entries match the search criteria and the latter returns only the first (lowest index) entry that matches. The equivalent searching functions, where sorting is not assumed, are findall() and findfirst().

Arrays of structs or structs of arrays?

Sometimes it can be tricky to decide on the best data structures to store your data in. If, as an example, we were interested in saving coordinates of points (x, y and z), we could create a struct to store the three values and then have an array of these structs for the various data points.

struct Point{T}
    x::T
    y::T
    z::T
end

points = [Point(randn(3)...) for _ in 1:100]
points[1].x
# 0.36386577310109675
points[2].y
# 0.7899375459147395

Note

points = [Point(randn(3)...) for _ in 1:100] may require some explanation (or at least a reminder):

randn(3) returns and array of three normally distributed random numbers. The constructor function for the Point struct expects three inputs - the x, y and z values. The ... is the splatting operator, which splits the array into the three parameters that the constructor expects.

We use _ as the loop variable in the array comprehension to explicitly indicate that this is a variable we are not going to care about and which can be discarded by the compiler.

Sometimes however, we may just want to access all of the x-values, or all of the y-values. Having an array of the x-values and an array of the y-values would then be very useful. Instead of now looping through the points array, and extracting the x field values, we can use the StructArrays package.

Using StructArrays, we would do the following:

using StructArrays

struct Point{T}
    x::T
    y::T
    z::T
end

points_sa = StructArray{Point}((randn(100), randn(100), randn(100)))
# 100-element StructArray(::Vector{Float64}, ::Vector{Float64}, ::Vector{Float64}) with eltype Point:
#  Point{Float64}(0.2296170085088406, -0.754060138141318, -0.03293009940470027)
#  Point{Float64}(1.1490000593949594, -1.1958286478128815, -2.1977494726346016)
#  ⋮
#  Point{Float64}(-1.356570072198423, 0.2094973793724703, 0.5984836696889012)
#  Point{Float64}(0.3323952236779933, -0.01368437886905607, -1.3470011216406736)

points_sa[1]
# Point{Float64}(0.2296170085088406, -0.754060138141318, -0.03293009940470027)

points_sa.x
# 100-element Vector{Float64}:
#   0.2296170085088406
#   1.1490000593949594
#   ⋮
#  -1.356570072198423
#   0.3323952236779933

This package can give immense flexibility at very little loss of efficiency.

Footnotes

A predicate function is a function that returns true or false based on some specified conditions.↩︎
The exclamation mark itself is just a note to the user - it makes no difference to the compiler. A function without an exclamation mark can mutate parameters, and a function with an exclamation mark could as easily not mutate.↩︎