What makes a match? For CRISPR type III-B, it depends on location

CRISPR is used by some bacteria to target and destroy DNA and RNA from invading viruses and other sources - but how do they keep from damaging their own DNA? A recent paper from our lab, published in the Journal of Biological Chemistry, shows how one type of CRISPR system distinguishes its own DNA from foreign DNA: "Target sequence requirements of a type III-B CRISPR-Cas immune system."

Many CRISPR systems avoid damaging the cell’s own copy of the sequence by cutting only sequences that are flanked by a Protospacer Adjacent Motif, or PAM, which is not found in the host's copy. But Type III systems don't use PAMs. Instead, a region of the host’s DNA sequence, known as the protospacer flanking sequence, or PFS, seems to protect the host DNA from cleavage. Type III systems also differ in that they tolerate more mismatches in the other sections of the sequence.

How these differences in identification work wasn’t clear – until Kaitlin Johnson, with others in the Bailey Lab, developed ways to test what was required to protect the sequence from recognition and cleavage.

To answer this question, Johnson looked at different parts of the CRISPR RNA (crRNA) that the cell uses to identify the invading sequence, and the RNA target sequence it would bind to.

As shown in the figure below, the crRNA has two main parts: a 3’ spacer region, shown in green, that matches the target RNA sequence, and a short 5’ crRNA tag, shown in blue. The RNA target has three regions. In the middle, in light purple, is the region that is complementary to the crRNA spacer. On either side is a flanking sequence, shown in dark purple. The 3’ flank is the PFS, which lines up with the crRNA tag. If the RNA target were synthesized from the host’s genomic copy of the sequence, the PFS would be complementary to the crRNA tag, shown as the red anti-tag below and to the right. Because this is specific to the host’s copy of the sequence, foreign sequences should have a non-complementary sequence, as shown below and to the left in dark green.

 
[Figure: crRNA and RNA target, with crRNA tag and RNA PFS to the left. 2 views: in both the crRNA sequence is AUG GAAA G. Noncomplementary box: PFS is the same sequence. Anti-tag box: PFS is UAC CUUU G, with the middle complementary to anti-tag.]
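
To make the tag versus anti-tag comparison concrete, here is a minimal Python sketch that counts base pairs between the crRNA tag and a PFS. It is only an illustration, not the analysis used in the paper. The function name `paired_positions` is hypothetical, the sequences are taken from the figure description with the spaces removed, and the sketch assumes both strings are written in the aligned order shown in the figure, so that column i of the tag faces column i of the PFS. Only Watson-Crick pairs are counted.

```python
# Watson-Crick RNA pairs; G-U wobble pairs are deliberately ignored (assumption).
RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def paired_positions(crrna_tag: str, pfs: str) -> list:
    """Return, column by column, whether each tag base pairs with the PFS base.

    Both strings are assumed to be written in the aligned order used in the
    figure (column i of the tag faces column i of the PFS), not both 5'->3'.
    """
    if len(crrna_tag) != len(pfs):
        raise ValueError("tag and PFS must be the same length")
    return [(t, p) in RNA_PAIRS for t, p in zip(crrna_tag.upper(), pfs.upper())]


# Sequences as given in the figure description, spaces removed.
CRRNA_TAG = "AUGGAAAG"
NONCOMPLEMENTARY_PFS = "AUGGAAAG"  # same sequence as the tag, so nothing pairs
ANTI_TAG_PFS = "UACCUUUG"          # host-like PFS

print(sum(paired_positions(CRRNA_TAG, NONCOMPLEMENTARY_PFS)))  # 0 paired columns
print(sum(paired_positions(CRRNA_TAG, ANTI_TAG_PFS)))          # 7 paired columns
```

With these example sequences, the foreign-like PFS pairs at none of the eight positions, while the host-like anti-tag pairs at seven of them.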
 

To find out which parts of the PFS protected the DNA, and which types of mismatches were tolerated, Johnson made variations of the sequences, about 150 in total, and tested each one: Did the DNA still get cleaved? Did the RNA target and crRNA still bind? By doing this, Johnson was able to work out several of the rules for how the binding occurs and what is required to either activate or deactivate cleavage.

In short, Johnson found that the rules for turning off DNA cleavage were fairly strict: even just a short piece of the anti-tag was enough to protect an RNA target from cleavage. But within the protospacer region, the matches could be fairly loose. As long as the first segment matched, mismatches in the last five of the six segments still led to cleavage. The figure below shows how matches or mismatches in the PFS and the first protospacer segment affect the system: a non-matching PFS and a matching first segment are activating, while a matched PFS and a mismatched first segment are deactivating.

 
[Figure: Similar alignment as above, with 2 pairs of boxes. Two are “activating”: the previous “noncomplementary” box and another with the spacer/protospacer segment 1, with paired bases. Two are “deactivating”: the anti-tag box and segment 1 with no base pairs.]
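
The two checks described above can be written as a small truth table. The Python sketch below is a toy summary rather than a model from the paper. The function name `predicts_dna_cleavage` is hypothetical, and it assumes that either deactivating feature on its own (a PFS that matches the crRNA tag, or a mismatched first segment) is enough to block cleavage, which is one reading of the figure. It also reduces each feature to a yes/no value and ignores segments 2 through 6 entirely.

```python
def predicts_dna_cleavage(pfs_matches_tag: bool, segment1_matches: bool) -> bool:
    """Toy summary of the two checks described in the post.

    Assumption: cleavage is predicted only when the PFS does NOT pair with the
    crRNA tag (the target looks foreign) AND the first protospacer segment
    pairs with the spacer. Mismatches in segments 2-6 are not considered.
    """
    return (not pfs_matches_tag) and segment1_matches


# Print every combination of the two features.
for pfs_matches_tag in (False, True):
    for segment1_matches in (False, True):
        outcome = predicts_dna_cleavage(pfs_matches_tag, segment1_matches)
        print(
            f"PFS matches tag: {pfs_matches_tag!s:5}  "
            f"segment 1 matches: {segment1_matches!s:5}  "
            f"-> {'activating' if outcome else 'deactivating'}"
        )
```

Under these assumptions, only one of the four combinations is activating: a non-matching PFS together with a matching first segment.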
 

The strict rules for deactivating DNA cleavage make sense given one key difference between Type III systems and other CRISPR systems. While others target the foreign DNA or RNA sequences directly, Type III systems identify the RNA sequence made when RNA polymerase starts transcribing the foreign DNA, and then cleave both the newly synthesized RNA and the nearby DNA template. If the host’s copy were transcribed in the direction opposite to the crRNA template, it would “look” like the foreign sequence, leading to cleavage of the genome. So, while other systems use the PAM to check whether the target is foreign before cutting, Type III systems check the PFS and block cleavage if the sequence is similar to that of the host.

For full details, check out the paper. In addition to determining the general rules, Johnson also found that one of the bases in the PFS, the guanine at the -1 position, is required, but it is recognized through an independent mechanism rather than through base pairing.