Short-term memory (also called "primary" or "active" memory) is the capacity for holding, but not manipulating, a small amount of information in mind in an active, readily available state for a short period of time. The duration of short-term memory (when rehearsal or active maintenance is prevented) is believed to be on the order of seconds. The most commonly cited capacity is 7±2 items, from Miller's paper "The Magical Number Seven, Plus or Minus Two" (a figure frequently referred to as Miller's Law), even though Miller himself stated that the figure was intended as "little more than a joke" (Miller, 1989, page 401) and Cowan (2001) provided evidence that a more realistic figure is 4±1 units. In contrast, long-term memory can hold an indefinite amount of information.
Short-term memory should be distinguished from working memory, which refers to structures and processes used for temporarily storing and manipulating information (see details below).
The idea of dividing memory into short-term and long-term stores dates back to the 19th century. A classical model of memory developed in the 1960s assumed that all memories pass from a short-term to a long-term store after a short period of time. This model is referred to as the "modal model" and was most famously detailed by Atkinson and Shiffrin. The exact mechanisms by which this transfer takes place, whether all or only some memories are retained permanently, and indeed whether there is a genuine distinction between the two stores, remain controversial topics among experts.
One form of evidence cited in favor of the separate existence of a short-term store comes from anterograde amnesia, the inability to learn new facts and episodes. Patients with this form of amnesia have an intact ability to retain small amounts of information over short time scales (up to 30 seconds) but are dramatically impaired in their ability to form longer-term memories (a famous example is patient HM). This is interpreted as showing that the short-term store is spared from amnesia and other brain diseases.
Other evidence comes from experimental studies of list recall. Some manipulations (e.g., a distractor task following learning, such as repeatedly subtracting a single-digit number from a larger number; cf. the Brown-Peterson procedure) impair memory for the 3 to 5 most recently learned words of a list, which are presumed to be still held in short-term memory, while leaving recall of words from earlier in the list, which are presumed to be stored in long-term memory, unaffected. Other manipulations (e.g., semantic similarity of the words) affect memory only for earlier list words and do not affect memory for the last few words in a list. These results show that different factors affect short-term recall (disruption of rehearsal) and long-term recall (semantic similarity). Together, these findings show that long-term memory and short-term memory can vary independently of each other.
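To make the distractor task concrete, the following minimal Python sketch generates the sequence of correct responses for a backward-counting task of the kind described above, in which a single-digit number is repeatedly subtracted from a larger starting number. The function name, its parameters, and the example numbers are hypothetical and chosen only for illustration; they are not taken from any particular study.

```python
def backward_count(start: int, step: int, n_responses: int) -> list[int]:
    """Return the correct responses for a backward-counting distractor task.

    start       -- the larger starting number shown to the participant
    step        -- the single-digit number subtracted on each response
    n_responses -- how many responses to generate
    (All three names are illustrative assumptions.)
    """
    if not 1 <= step <= 9:
        raise ValueError("step must be a single-digit number (1-9)")
    # The k-th response is the starting number minus k steps.
    return [start - step * k for k in range(1, n_responses + 1)]


# Example: counting backward by threes from 307.
print(backward_count(307, 3, 5))  # [304, 301, 298, 295, 292]
```

Because the participant must keep computing new answers, the task occupies attention and is assumed to prevent rehearsal of the list items during the retention interval.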