Feel free to explore the blog index, the 2020 blog, or subscribe in RSS.

Cooking and Crypto


Lately I've been pretty busy, so haven't put aside much time to write. However, things have now settled down at work and with my hobbies, so it is worth giving an update.

Early last month Luke Smith posted a video lamenting the fact that recipe websites are incredibly bloated. Between the ads and the endless pages of life stories, it would be worth having a simple website that hosted recipes and nothing else. Fortunately, provided just that. Unfortunately, their business model consisted of scraping recipes and images from other websites. Aside from the issues of rehosting copyrighted material, food bloggers stood to lose their ad-revenue, and so mobbed the creators. In response, recipeasly was taken down.

Rejecting all that free press (whether good or bad) seems a waste. So I immediately bought the, .net, and .ca domains and set up a website of my own. So far, you can search recipes, browse by category, or via an image gallery. Each recipe also includes a printable index card. In the upcoming weeks I'll add a submission form, a rating system for each recipe, and possibly comments. Since I started, Luke Smith has also put up the site However, I don't care if my cooking is "based" and Luke is generally opposed to JS, so there is plenty of room for both our websites. As a way of marketing my site, and to ensure each recipe has a photo, I may hold a photo contest in the future with a cash or cryptocurrency prize.

Speaking of cryptocurrency, I've also spent a fair amount of time reading about the Basic Attention Token (BAT) since Brave released their new roadmap. This is in part because the price was steadily increasing. For example, the figures below demonstrate that the daily returns and volatility (averaged over each month) of BAT have steadily increased in recent months. Moreover, after a long period of underperforming relative to bitcoin, BAT has been outperforming since the start of 2021. As such, I managed to buy a fair amount at $0.68 before it shot up to $1.30.

performance of bat relative to bitcoin
Left: daily log-returns and risk (SD log-returns) from BAT, averaged over each month since April, 2019. Right: difference between BAT and Bitcoin log-returns.

However, I am more interested in the long-term use cases for BAT, and likely won't sell for the next few years. As of today, Brave now has 29 million active users and user growth has been incredibly steady. Since January 2019, users have increased by 124.92% per year, as indicated by log-linear regression. Moreover, this increase is extremely reliable, having an r-squared of 0.9877. In other words, 98.77% of the change in user numbers is accounted for as an exponential increase over time. If the current trend holds, Brave can expect to have 69 million users one year from now and 155 million in two years. However, with the upcoming release of self-serve ads, a decentralized exchange, and the ability to pay for goods and services in BAT, my estimate is likely conservative.

number of brave users over time
Total number of Brave monthly active users since release of the browser.

I've also been spending a good deal of time researching IOTA, as a fast, feeless, energy efficient network that scales limitlessly will be necessary for the future of cryptocurrency. Moreover, I'd like to set up a sensor network in my community, and IOTA seems like a promising platform for such projects. However, this is a topic for a future post.

Fri, 2021/04/02 06:02:30 +0000

The Weekly Bitcoin Dump


There's a long standing belief that crypto dumps over the weekend, one reinforced by prices over the last few days. If systematic highs and lows do exist the market should adapt, causing the timing of the weekly dump to drift over time. Fortunately, it is easy to verify whether this belief holds true from historical price data. Hourly closing Bitcoin prices on Binance between August, 2017 and February 2020 are available from Using this data, I calculated hourly prices in each week as the percentage of the weekly maximum. Afterwards, it is a simple matter to average all the weeks in a year, providing an indication of when Bitcoin is at its lowest.

bitcoin prices
Average hourly Bitcoin prices during the week, as a percentage of the weekly maximum.

As seen in the above figure, Bitcoin is generally at its lowest and highest during the weekend. Between 2017-2021, weekly lows moved from Sunday, to Friday, and back to Sunday. However, over the first months of 2021 the weekly dump has occurred Monday morning. Conversely, the best time to sell has drifted from Saturday to Sunday.

bitcoin price trends
A: A 5th degree polynomial fit to Bitcoin prices. B: Detrended Bitcoin prices.

However, it could be argued that this data only demonstrates the average linear trend in Bitcoin prices over the week. In years where the price is steadily increasing, one should expect that the Saturday price will exceed that of the preceding Sunday. To correct for this bias, I detrended the price by fitting a 5th degree polynomial. As seen in the above figure, this was subtracted from the actual price, so that it now fluctuates around an average of $9900.

detrended bitcoin prices
Detrended average hourly Bitcoin prices during the week, as a percentage of the weekly maximum.

The same hourly prices over the week were calculated using this detrended data. As seen above, prices have still tended to pump on Saturday since 2017. However, this is usually preceded by a dump on Friday morning. More recently, Bitcoin prices have tended to dump Monday, rather than on the weekend. Therefore, the belief that crypto dumps over the weekend does not hold true every year.

Thu, 2021/03/04 06:28:44 +0000

Pulling User Data from Fosstodon


I stopped using Facebook a while ago. Aside from the issues with privacy and free expression, it was a time sink, and the news feed only served as a source of unnecessary anger. However, I have been using the Mastodon instance Fosstodon lately. It's largely devoid of international news, I enjoy the community, and has a code of rules that suit my preferences.

I'm interested in how Fosstodon operates, when its members are the most active, and how often they interact with one another. As such, I've been playing with the Mastodon API, in order to pull statuses and user information from the server. While I'm not much for Python, the library is an easy tool for interacting with the API. It only requires registering an application and logging in to your server.

from mastodon import Mastodon

# create an application
     api_base_url = '',
     to_file = 'myapp_clientcred.secret'

# log in to our server
mastodon = Mastodon(
    client_id = 'myapp_clientcred.secret',
    api_base_url = ''
    to_file = 'pytooter_usercred.secret'

After connecting to your instance, the mastodon.timeline() function pulls statuses from the home, local, or public timelines, or statuses with a given tag, starting from max_id. For me, each call returned the last 40 statuses, so I used the following loop to pull the last 50000 toots from the local timeline, from May 28, 2020 to February 7, 2021. Keep in mind there is a request limit of 300 per 5 minutes, so the time.sleep() funtion can be used to space out requests.

# request files from local timeline, starting at max_id
myFile=mastodon.timeline(timeline='local', max_id=105687600003001040, limit=40)

# use last status id at starting point for next request
for x in range(1249):
    myFile=mastodon.timeline(timeline='local', max_id=myId, limit=40)

The end result is a list of dictionaries that contain the time and content of each status, tags, number of replies, boosts, etc. Also nested within is a dictionary detailing the associated account, creation date, total followers, etc., that I pulled into a separate list. Given I am only insterested in the frequency and timing of toots, I also removed status tags, content, and other unnecessary information. Finally, I put each list into a pandas data frame and exported these to excel files.

# put account information into separate list
accounts = [output[0]['account']]
for d in output[1:]:

# remove unwanted fields
for d in output:
    for e in ['in_reply_to_account_id', 'spoiler_text','uri','favourited','reblogged',
        d.pop(e, None)
for d in accounts:
    for e in ['username','display_name','note','url','avatar','avatar_static',
        d.pop(e, None)

#convert lists to data frames
import pandas
output = pandas.DataFrame(output)
accounts = pandas.DataFrame(accounts)

# delete time zone from status and account creation dates
dfPosts['created_at'] = dfPosts['created_at'].astype(str).str[:-6]
dfAccounts['created_at'] = dfAccounts['created_at'].astype(str).str[:-6]

# cast account id to string to prevent rounding errors
dfPosts['id'] = dfPosts['id'].astype(str)

CSV files containing the Fosstodon statuses from May 28, 2020 to February 7, 2021 and associated accounts are available here.

Sat, 2021/02/13 00:48:25 +0000

Scanning Negatives the Hard Way


Lately I've been getting back into film photography. Some years ago I took a fair number of medium format pictures using a Holga pinhole camera. I never digitized any of my negatives, mostly due to the lack of a decent scanner. I still lack a proper scanner, having only a sub-par Brother model meant for documents. However, I may be able to use a few mathematical tricks to make it work far past its intended potential.

scan of film negative
A poor-quality film scan.

As seen in my first scan, the main problem is I can't control the exposure time on my scanner and negatives end up much too dark. While some information can be recovered by correcting the contrast and gamma of each image, this is overwhelmed by the noise produced by the scanner. Moreover, I was lazy so the film was obscured by dust, hairs, and fingerprints. Fortunately, the random noise can be removed by averaging multiple scans. However, I also wanted to try removing various sources of systematic noise (i.e., prints, hairs, film imperfections). Ultimately, I took 125 scans of the negative in four different positions on the scanner bed, each with their own imperfections. After reading each image into Matlab, I corrected the contrast, gamma, and registered each image to align differences in position and rotation.

img = zeros(2692,5738,125);
for i = 1:125
    % read in scans and correct gamma and contrast limits
    img(:,:,i) = 1-mat2gray(double(imread(['Image_',num2str(i),'.bmp'])));
    img(:,:,i) = imadjust(img(:,:,i),[0.5, 1],[],10);

% register images, allowing for rotation and translation
mkdir Registered % directory for registered images
regImg = zeros(2692,5738,125);
[optimizer, metric] = imregconfig('monomodal'); % assume similar brightness and contrast
optimizer.MaximumIterations = 1000; % increase iterations
regImg(:,:,27)=img(:,:,27); % reference image

idx = [1:26, 28:125];
for i=1:length(idx)
    regImg(:,:,i) = imregister(img(:,:,i), img(:,:,27), 'rigid', optimizer, metric);

meanImg = mean(regImg,3);
figure; % display mean image

gamma and contrast corrected film scans
Noisy, gamma and contrast corrected scans.

The end result is pretty rough. However, we can remove the noise produced by the scanner by taking the average of each pixel across all scans.

averaged scan
The average of all 125 negative scans.

However, two sources of noise still remain. The various hairs and fingerprints present in the different scans have to be removed. Second, regular vertical lines are present throughout the negative. While these would not be visible in a proper scan, the increased image contrast has made them more present. While the fingerprints will be difficult to address, the lines are regular and can be removed my moving the image into the frequency domain.

axis image;

amplitude spectra
The amplitude spectra of the average image.

As seen above, a strong horizontal line is present throughout the frequency spectra of our image, representing the regular pattern of vertical lines. By removing these frequencies, we should also remove the lines without affecting the rest of our image. So, we will need a function to filter out frequencies ±2° from vertical. As the lines are closely spaced, we don't need to filter out low frequencies, only those above 200 cycles/image.

function thefilter = ofilter(oLow,oHigh,dim);
    % a bandpass filter of size dim that passes orientations between oLow and oHigh
    % horizontal and vertical are 0 and 90 degrees
    t1 = ((0:dim-1)-floor(dim/2))*(2/(dim));
    [t1,t2] = meshgrid(t1,t1);
    clear t2
    if oHigh>=oLow
        d=find((a=oHigh) & t1>0); % find out-of-band frequencies
        d=find((a=oHigh) & t1>0); % find out-of-band frequencies
    if ~isempty(d)

function thefilter = bpfilter(lowf,highf,dim);
    % a bandpass filter of size dim that passes frequencies between lowf and highf
    for yk=1:dim
        d=sqrt(dx.*dx + dy*dy);
        val1=find((d > lowf)&(d < highf));

function thefilter=filter2d(lowf,highf,orient1,orient2,dim);
    % combines ofilter and bpfilter to pass frequencies between lowf and highf
    % and orientations between orient1 and orient2

Given the function filter2d, we can apply it to each image to remove all frequencies above 200 cycles/image that are within 2° of vertical.

thefilter = filter2d(200,5738,88,92,5738);
thefilter = 1-thefilter(1524:4215,:);

for i=1:125
    amp = amp.*thefilter;
    ft = amp.*exp(sqrt(-1)*phase);

filtered image
Frequency filtered image.

Finally! We have removed the lines and are one step closer to having a clean picture. All that remains is to remove the various hairs and fingerprints present in each image. Fortunately, the position of these flaws varied between scans. I wiped down the negative, removing some prints and causing others to be deposited. Since presence or absence of these flaws are independent of the image, we can use independent component analysis (ICA) to isolate them and subtract them from our average image. The FastICA package is one of the best implementations of ICA.

filtImg=filtImg(59:2643,50:5678,:); % to avoid exceeding RAM capacity
[rows, columns, scans] = size(filtImg);
pcaArray = reshape(filtImg, rows * columns, scans)'; % reshape array for ICA
mu = mean(pcaArray,1); % the average image, ICA automatically subtracts this
[icasig, A] = fastica(pcaArray, 'lastEig', 8, 'numOfIC', 8); % take first 8 components
baseCase = mean(A(76:125,:),1); % scores for fingerprint-free images

% positive and negative parts of the first three components
ICAp1 = icasig(1,:); ICAp1(ICAp1 < 0) = 0;
ICAm1 = -icasig(1,:); ICAm1(ICAm1 < 0) = 0;
ICAp2 = icasig(2,:); ICAp2(ICAp2 < 0) = 0;
ICAm2 = -icasig(2,:); ICAm2(ICAm2 < 0) = 0;
ICAp3 = icasig(3,:); ICAp3(ICAp3 < 0) = 0;
ICAm3 = -icasig(3,:); ICAm3(ICAm3 < 0) = 0;

independent components
The first four independent components.

As seen above, ICA has isolated all the fingerprints and hairs that obscured our image within the first three independent components. Since all scans had defects, the absence of a flaw in one image implies the presence of another elsewhere. Therefore, each component captures both the presence (positive values & lighter pixels) and absence (negative values & darker pixels) of various features. As we want to make an image that contains none of these flaws, we separate the positve and negative values of each component.

All that remains is to find the ideal combination of component weights that removes all blemishes from the average image (mu). My approach was to generate random weights for two sets of components, add these to the average image, and indicate which looked better. It is a bit hacky, but a few iterations resulted in an image that looked half decent.

WeightArr = [rand(1,2).* 0.026 - 0.013; ...
    rand(1,2).* 0.016 - 0.008; ...
    rand(1,2).* 0.0608 - 0.0304];
for i = 1:100
    tmpArr = [rand(1,2).* 0.026 - 0.013; ...
        rand(1,2).* 0.016 - 0.008; ...
        rand(1,2).* 0.0608 - 0.0304];
    figure('units','normalized','outerposition',[0 0 1 1]);
    imshow(reshape((mu+ICAp1.*WeightArr(1,1)+ICAp2.*WeightArr(2,1)+ICAp3 ...
    .*WeightArr(3,1)+ICAm1.*WeightArr(1,2)+ICAm2.*WeightArr(2,2)+ICAm3 ...
    imshow(reshape((mu+ICAp1.*tmpArr(1,1)+ICAp2.*tmpArr(2,1)+ICAp3 ...
    .*tmpArr(3,1)+ICAm1.*tmpArr(1,2)+ICAm2.*tmpArr(2,2)+ICAm3 ...
    w = waitforbuttonpress;
    if w
        WeightArr = tmpArr;

imwrite(uint16(mat2gray(reshape((mu+ICAp1.*WeightArr(1,1)+ICAp2.*WeightArr(2,1) ... 
+ICAp3.*WeightArr(3,1)+ICAm1.*WeightArr(1,2)+ICAm2.*WeightArr(2,2)+ICAm3 ...
.*WeightArr(3,2) )', rows, columns) ).*65535),'FinalImg.tif');

the final image
The final image, after filtering, ICA, and averaging.

And here it is, the final image. While it still could use a lot of work, the end result is a far cry from what my scanner initially produced. However, the main lesson is that I should probably buy a better scanner.

Sun, 2021/01/24 05:15:35 +0000

Clean Drinking Water


The importance of clean groundwater is pretty universal, about half of the world's population uses it for drinking water. However, ensuring that groundwater is clean and salt-free usually requires installation of a permanent monitoring well, a costly and often impractical process. Installing a well requires drilling equiptment, the wells aren't cheap, and they often get destroyed.

Conversely, measuring the salt content of a soil sample is a fairly simple process. It's temporary, doesn't require much equiptment, and you dont have to worry about landowners or animals destroying your handiwork. A while ago I published a paper on how to estimate the salt content in groundwater from soil samples. I've been learning JavaScript lately, so I made a tool for estimating groundwater quality from soil samples based on my findings. If you work in the environmental industry, hopefully you find it useful.

Thu, 2021/01/07 17:58:14 +0000