A contig (from contiguous) is a set of overlapping DNA segments that together represent a consensus region of DNA. In bottom-up sequencing projects, a contig refers to overlapping sequence data (reads); in top-down sequencing projects, contig refers to the overlapping clones that form a physical map of the genome that is used to guide sequencing and assembly. Contigs can thus refer both to overlapping DNA sequence and to overlapping physical segments (fragments) contained in clones depending on the context.
A sequence contig is a continuous (not contiguous) sequence resulting from the reassembly of the small DNA fragments generated by bottom-up sequencing strategies. This meaning of contig is consistent with the original definition by Rodger Staden (1979). The bottom-up DNA sequencing strategy involves shearing genomic DNA into many small fragments ("bottom"), sequencing these fragments, reassembling them back into contigs and eventually the entire genome ("up"). Because current technology allows for the direct sequencing of only relatively short DNA fragments (300–1000 nucleotides), genomic DNA must be fragmented into small pieces prior to sequencing. In bottom-up sequencing projects, amplified DNA is sheared randomly into fragments appropriately sized for sequencing. The subsequent sequence reads, which are the data that contain the sequences of the small fragments, are put into a database. The assembly software then searches this database for pairs of overlapping reads. Assembling the reads from such a pair (including, of course, only one copy of the identical sequence) produces a longer contiguous read (contig) of sequenced DNA. By repeating this process many times, at first with the initial short pairs of reads but then using increasingly longer pairs that are the result of previous assembly, the DNA sequence of an entire chromosome can be determined.
Today, it is common to use paired-end sequencing technology where both ends of consistently sized longer DNA fragments are sequenced. Here, a contig still refers to any contiguous stretch of sequence data created by read overlap. Because the fragments are of known length, the distance between the two end reads from each fragment is known. This gives additional information about the orientation of contigs constructed from these reads and allows for their assembly into scaffolds.