Have you ever tried passing a length of thread through the eye of a needle? I’m sure you’d agree it’s a task that requires a high level of dexterity. Recent advances in nanotechnology, however, have made possible a far more remarkable kind of ‘threading’: guiding single polymer molecules through pores 100,000,000 times smaller! These ‘nanopores’ come in two varieties: biological, which actively and passively control the passage of ions within the cell, and solid-state, traditionally made from silicon-based materials and nowadays also from 2D materials such as graphene and transition metal dichalcogenides.
What good does this do? To answer this skeptical schoolboy question, one must begin at the Drndić lab. We pursue a lead first proposed in 1996 by Kasianowicz and his co-workers at the National Institute of Standards and Technology. The rationale behind Kasianowicz’s experiment was simple: DNA, being a charged polynucleotide, can be driven through a biological nanopore channel in a linear, head-to-tail fashion by an electric field (the DNA and nanopore are submerged in an ionic salt solution). While the DNA is traversing the nanopore channel, a process known as translocation, it momentarily decreases the ionic current because it occupies part of the volume of solution that carries the current. This characteristic decrease in the ionic current gives us important information about the structure of the DNA molecule.
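To make the idea of a ‘characteristic decrease’ concrete, here is a minimal sketch of how such dips could be picked out of a current vs. time trace. Everything in it is an assumption for illustration: the trace is synthetic (not lab data), the units are arbitrary, the function names `make_trace` and `find_blockades` are hypothetical, and the fixed-threshold rule is a simple stand-in rather than the method used in any particular lab.

```python
import random
import statistics

def make_trace(n=2000, baseline=1.0, drop=0.3, noise=0.02,
               events=((500, 560), (1200, 1290)), seed=0):
    """Build a synthetic current trace (arbitrary units) with blockade dips.

    Inside each (start, end) window the current falls by `drop`,
    mimicking a molecule partially blocking the pore.
    """
    rng = random.Random(seed)
    trace = []
    for i in range(n):
        level = baseline
        if any(start <= i < end for start, end in events):
            level -= drop
        trace.append(level + rng.gauss(0, noise))
    return trace

def find_blockades(trace, threshold_fraction=0.85):
    """Return (start, end) sample indices where the current dips below a
    fixed fraction of the open-pore level (estimated here as the median)."""
    open_pore = statistics.median(trace)
    threshold = threshold_fraction * open_pore
    found, start = [], None
    for i, current in enumerate(trace):
        if current < threshold and start is None:
            start = i                      # dip begins
        elif current >= threshold and start is not None:
            found.append((start, i))       # dip ends
            start = None
    if start is not None:                  # trace ended mid-dip
        found.append((start, len(trace)))
    return found

trace = make_trace()
blockades = find_blockades(trace)
for start, end in blockades:
    depth = statistics.median(trace) - statistics.mean(trace[start:end])
    print(f"samples {start}-{end}: duration {end - start}, depth {depth:.2f}")
```

The two numbers printed for each dip, its duration and its depth, are the kind of features from which information about the molecule would be inferred. Real traces are far noisier than this toy example, so a fixed threshold is only a starting point.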
Could this technique prove to be a time- and cost-efficient way of sequencing DNA on a large scale? Further investigation progressed impressively, revealing subtle features of different DNA molecules from their current vs. time electrical signals. For example, strands of ‘A’ bases (poly-A) were found to travel an order of magnitude slower than strands of the other bases; 3’-threaded DNA moved at half the speed of 5’-threaded DNA because of their differing tilt orientations; and biological nanopores could be locally manipulated to study other complex processes such as the pairing of complementary DNA strands (‘duplex binding’). Since DNA translocates through nanopores at 10-100 base pairs per microsecond, the entire human genome, roughly 3 billion base pairs, could in principle be threaded through a single pore in as little as 30 seconds!
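The 30-second figure follows from simple arithmetic, sketched below. The genome size of about 3 billion base pairs is an assumed round number, and the calculation imagines a single pore reading continuously at a constant speed, which is an idealization.

```python
# Rough estimate only: assumes ~3 billion base pairs read through a single
# pore, back to back, at a constant speed with no gaps between molecules.
GENOME_BASE_PAIRS = 3e9  # assumed approximate size of the human genome

def sequencing_time_seconds(base_pairs_per_microsecond):
    """Time to thread GENOME_BASE_PAIRS through one pore at a fixed speed."""
    base_pairs_per_second = base_pairs_per_microsecond * 1e6
    return GENOME_BASE_PAIRS / base_pairs_per_second

fast = sequencing_time_seconds(100)  # upper end of the quoted range
slow = sequencing_time_seconds(10)   # lower end of the quoted range
print(f"{fast:.0f} s at 100 bp/us, {slow:.0f} s at 10 bp/us")
```

The headline number therefore corresponds to the fastest quoted speed; at the slowest, the same estimate gives about five minutes.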
DNA translocation through nanopores, therefore, seemed to be a very promising third-generation method of DNA sequencing. But a major problem persisted: the DNA molecules were travelling too fast. Although we could distinguish between long strings of ‘A’ bases and long strings of ‘C’ bases, we could no longer tell the bases apart when the DNA molecule consisted of a random combination of them. This was mainly because the DNA was travelling so fast that the statistical fluctuations in the flow of ions through the pore overwhelmed the subtle differences between the four types of DNA bases. In short, single-base resolution could not be achieved.
The modern approach to solving the problem of ‘experimental noise’ replaced biological nanopores with solid-state ones, fabricated by etching tiny holes in silicon-based materials using a transmission electron microscope. These ‘man-made’ pores offer advantages over their biological counterparts, including very high stability, control over diameter and channel length, and the potential for integration into electronic devices. The move to solid-state pores has allowed us to take some significant steps towards a solution. For example, it was found that the DNA could be slowed down considerably by increasing the salt concentration, decreasing the driving voltage and increasing the viscosity of the surrounding fluid.
Despite the advancements that have come from this approach, there is still much to uncover before we achieve single-base resolution. Other than studying the wealth of literature on DNA translocation, my summer has largely been focused on developing my computer science skills in order to break down the current vs. time signals recorded with solid-state pores in experiments conducted at the Drndić lab. After learning the basics of the ‘Python’ language, my project involved exploring various topics in ‘machine learning’ (a new and very engaging field in its own right) in the hope that the computer, with minimal input from my end, would be able to algorithmically identify statistically significant aspects of these signals and give experimenters new leads that they haven’t even considered. For instance, the DNA was first thought to go through the pores linearly, but what if it is actually bending and folding? How would this affect the current signal, and how would we isolate this effect from the marginal differences between the DNA bases? The use of machine learning in such scenarios is clearly powerful. Over the last ten weeks, I have laid much of the groundwork needed to implement this approach, and I do hope that it will prove to be successful during the academic year.