• Introducing py_wsi for computer analysis on whole slide .svs images using OpenSlide

    A large, often unexpectedly time-consuming part of deep learning with whole slide images (WSI) is data preparation. The images are gigabytes in size, and simple things such as saving and loading patches can be painful. My new Python package py_wsi allows for intuitive, painless patch sampling using OpenSlide, labels patches automatically from Aperio ImageScope XML annotation files, and provides functions for saving these patches and their metadata into Lightning Memory-Mapped Databases (LMDB). It is meant for fast prototyping, and will later include extensions to save to HDF5 files. The package can be forked from GitHub or installed via pip install py_wsi. Version 1.0 or later is highly recommended.

    Read more
  • Managing memory with large datasets in TensorFlow

    In the past nine months or so I’ve been working primarily with high-resolution, high-magnification digital histopathology images, which are hundreds of thousands of pixels in diameter and gigabytes in size. Anyone working with large, content-rich datasets and TensorFlow has developed an ambivalent reaction to the following error message:

    tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[4096]
    

    Here are three simple ways to help maximise usage of available resources. Alternatively, you could go work at Google DeepMind where memory is not an issue.

    Read more
  • Simple proof of solution to the Monty Hall problem

    The Monty Hall problem was famously solved by Marilyn vos Savant in 1990, and her answer was infamously disputed by many intellectuals. The premise is roughly as follows:

    There are three doors: behind two are goats, and behind one is a car. You will receive the object behind the one you open. You must first choose one door, and the host (who knows the contents of all three doors) will then open one of the two remaining doors and show you a goat. You can then choose to open your original choice, or switch your choice to the remaining door. Are you more likely to get the car if you switch?
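
    As a quick check on the premise above, here is a minimal sketch that enumerates every equally likely case exactly instead of simulating. It is an illustration, not the proof from the post. The function name win_probabilities is hypothetical, and the sketch assumes the host picks uniformly at random when both unchosen doors hide goats.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)


def win_probabilities():
    """Return (P(win if you stay), P(win if you switch)) by exact enumeration."""
    stay = Fraction(0)
    switch = Fraction(0)
    # The car position and the first pick are independent and uniform: 9 cases.
    for car, pick in product(DOORS, DOORS):
        # The host may only open a door that is neither the pick nor the car.
        host_options = [d for d in DOORS if d != pick and d != car]
        for opened in host_options:
            # Assumption: the host chooses uniformly among the allowed doors.
            weight = Fraction(1, 9 * len(host_options))
            # Switching means taking the one door that is neither picked nor opened.
            switched = next(d for d in DOORS if d != pick and d != opened)
            if pick == car:
                stay += weight
            if switched == car:
                switch += weight
    return stay, switch


stay, switch = win_probabilities()
print(f"stay: {stay}, switch: {switch}")  # stay: 1/3, switch: 2/3
```

    Switching wins in exactly the cases where the first pick was a goat, which is two of the three equally likely starting positions.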

    Read more
  • Drawing and saving overlays on Google Maps with the Google Maps API in a Rails application

    Google Maps has a well-documented API for customising maps with your own imagery and content. For a side project built with Ruby on Rails, I wanted to allow users to freehand highlight routes on a map and save them for future viewing and modifying. The best way seemed to be to use complex polylines and save the lines as an overlay paired with the map window information. Here is a straightforward way of doing it cleanly within the Rails framework.

    Read more
  • Installing TensorFlow: fixing easy-install.pth and setuptools issues

    I recently installed TensorFlow on my Mac OS laptop and ran into some small issues. Maybe this will be useful to someone in the future, so here’s how they were resolved.

    The straightforward sudo pip3 install --upgrade tensorflow threw errors, so I took TensorFlow’s suggestion to try installing directly from the Python package URL (I am installing the CPU only version).

    Read more
  • Jekyll with GitHub hosting

    WordPress is a stable, well-documented, powerful blogging platform. I’ve developed WordPress sites, so I’ve seen firsthand the wide range of things it can do. But Jekyll has been piquing my curiosity for quite some time. This post will cover how I customised a few things in my clean installation of Jekyll. GitHub Pages gets credit for the free, seamless hosting.

    Read more
  • Learning LaTeX: floats

    Up until now, I have written all my papers, reports and documents using Microsoft Word. Granted, I have used the IEEE or some other helpful Word templates on occasion, but yes, I have re-worked references and renamed figures many times, manually. Find-and-replace is very manual, as is typing numbers.

    Last week my supervisor suggested I invest a few hours in learning LaTeX and exploring some kind of reference management system. A little exploration and some helpful tips from classmates, and I’ve committed to a workflow mostly involving Google Scholar Libraries, LaTeX, and BibTeX.

    Read more
  • Displaying images and plotting stuff with matplotlib.pyplot

    The matplotlib.pyplot library is my go-to for easy generation of graphs, charts, histograms and anything that can be plotted, and recently I have also been using it lots for viewing images.

    The only hitch is, I can never remember the syntax for basic, common things like displaying more than one plot at a time on a graph, or showing a graph with multiple lines and a matching legend box. I confess to having googled “pyplot subplots” a few too many times before writing up my own functions which do the job and are easy to use and remember.
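
    For reference, the two tasks mentioned above (several lines with a matching legend, and several images in a grid of subplots) can be sketched roughly as follows. These are not the functions from the post: plot_lines and show_images are hypothetical helper names, and the Agg backend is an assumption made so that the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt


def plot_lines(series, title=""):
    """Plot several named lines on one set of axes with a matching legend.

    `series` maps a legend label to an (xs, ys) pair.
    """
    fig, ax = plt.subplots()
    for label, (xs, ys) in series.items():
        ax.plot(xs, ys, label=label)
    ax.set_title(title)
    ax.legend()
    return fig, ax


def show_images(images, titles=None, cols=2):
    """Show a list of 2-D images in a grid of subplots, `cols` per row."""
    rows = -(-len(images) // cols)  # ceiling division
    # squeeze=False always returns a 2-D array of axes, even for one row.
    fig, axes = plt.subplots(rows, cols, squeeze=False)
    for i, ax in enumerate(axes.ravel()):
        ax.axis("off")  # hide ticks; unused grid cells stay blank
        if i < len(images):
            ax.imshow(images[i], cmap="gray")
            if titles:
                ax.set_title(titles[i])
    return fig, axes


xs = list(range(5))
fig1, ax1 = plot_lines(
    {"linear": (xs, xs), "squared": (xs, [x * x for x in xs])},
    title="Two lines",
)
checker = [[0, 1], [1, 0]]  # tiny 2x2 placeholder image
fig2, axes2 = show_images([checker] * 3, titles=["a", "b", "c"])
```

    To view the figures interactively, drop the matplotlib.use("Agg") line and call plt.show(); otherwise fig1.savefig(...) writes them to disk.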

    Read more
  • Thoughts on getting good marks, from someone who does not always get good marks

    After an undergraduate degree at McGill University and most of a master’s at the University of Leeds, you would think that I would have learned how to do coursework the way the professor expects it to be done.

    Unfortunately, I can’t say I’m an expert. But here are a few thoughts on the subject.

    Read more
