Will Intel introduce memory controllers in processors?

Memory controller

A memory controller is a digital circuit that controls the flow of data to and from RAM. It can be a separate chip or be integrated into a more complex chip, for example a northbridge, a microprocessor, or a system on a chip.

Computers using Intel microprocessors traditionally had the memory controller built into the chipset (the northbridge), but many modern processors, such as the DEC/Compaq Alpha 21364, AMD Athlon 64 and Opteron, IBM POWER5, Sun Microsystems UltraSPARC T1, and Intel Core i7, have an integrated memory controller on the same die to reduce memory access latency. Although integration increases system performance, it ties the microprocessor to one specific type of memory and prevents combining processors and memory of different generations.

To introduce new types of memory, new processors have to be released and the socket has to change (for example, after the appearance of DDR2 SDRAM, AMD released Athlon 64 processors that used the new Socket AM2).

Integrating the memory controller with the processor is not a new technology. Back in the 1990s, the DEC Alpha 21066 and HP PA-7300LC had built-in controllers to reduce system cost.

Tasks

The memory controller contains the logic needed to perform read and write operations on DRAM and to refresh the data stored in DRAM.

  • Without periodic refresh, DRAM chips lose information, because the capacitors that store the bits discharge through leakage currents.
  • The typical time of reliable information retention is fractions of a second, but no less than 64 milliseconds according to JEDEC standards.

Over longer periods, information is retained only partially.
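The refresh requirement can be turned into a simple timing budget. The sketch below is only an illustration: the 64 ms window comes from the text, while the figure of 8192 refresh commands per window is an assumption about a hypothetical device, not something stated above.

```python
# Sketch: spreading DRAM refresh evenly over the retention window.
RETENTION_MS = 64.0  # retention window quoted in the text (JEDEC)
ROWS = 8192          # assumption: refresh commands needed per window


def refresh_interval_us(retention_ms: float = RETENTION_MS, rows: int = ROWS) -> float:
    """Average time between two refresh commands, in microseconds."""
    return retention_ms * 1000.0 / rows


print(refresh_interval_us())  # 7.8125
```

Under these assumptions the controller has to issue one refresh command roughly every 7.8 microseconds, in between ordinary reads and writes.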


Memory is characterized by:

  1. Capacity (KB, MB, or GB).
  2. Speed, or memory access time.
  3. Volatility: its behavior after the power is switched off.

Fig. 3.4 Types of memory (author's drawing).

RAM (random access memory)

Advantage. It is the fastest electronic memory, intended for short-term storage of information.

Disadvantage. The main property of this memory is volatility, which means that data is lost after the power supply is switched off. To back up the RAM, some controllers use batteries or high-capacity capacitors that can hold an electric charge for up to several days.

A RAM cell is either an electronic flip-flop (static memory) or a capacitor (dynamic memory).

Fig. 3.5 Flip-flop, the basic element of RAM

Dynamic memory requires periodic recharging of its capacitors, but it is cheaper than static memory.

A memory matrix is a collection of individual memory cells (flip-flops).

One row of the matrix contains 8 memory cells (8 bits equal 1 byte).

Each memory cell has its own unique address (row number, "dot", bit number).

The columns (bits) are numbered from right to left, from "0" to "7".

The rows (bytes) are numbered from top to bottom, starting with "0".
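The addressing scheme described above (row number, dot, bit number) can be tried out in a few lines of Python. This is a minimal sketch, not a model of any particular controller; the class name `MemoryMatrix` and the string form `"2.7"` are hypothetical and chosen only for illustration.

```python
class MemoryMatrix:
    """Rows are bytes numbered from 0 downwards; bits are numbered 0..7 from right to left."""

    def __init__(self, rows: int) -> None:
        self.rows = [0] * rows

    @staticmethod
    def parse(address: str) -> tuple:
        """Split an address such as "2.7" into (row, bit)."""
        row_text, bit_text = address.split(".")
        row, bit = int(row_text), int(bit_text)
        if not 0 <= bit <= 7:
            raise ValueError("bit number must be between 0 and 7")
        return row, bit

    def write_bit(self, address: str, value: bool) -> None:
        row, bit = self.parse(address)
        if value:
            self.rows[row] |= 1 << bit
        else:
            self.rows[row] &= ~(1 << bit)

    def read_bit(self, address: str) -> bool:
        row, bit = self.parse(address)
        return bool((self.rows[row] >> bit) & 1)


m = MemoryMatrix(rows=4)
m.write_bit("2.0", True)  # rightmost bit of row 2
m.write_bit("2.7", True)  # leftmost bit of row 2
print(format(m.rows[2], "08b"))  # 10000001
```

Printing the row as eight binary digits shows bit 7 on the left and bit 0 on the right, which matches the right-to-left numbering in the text.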

Fig. 3.6 Memory matrix

ROM (read-only memory) is designed for long-term storage of information. Its main difference from RAM is that it retains information without power, i.e. it is non-volatile. This memory is in turn divided into two types: one-time programmable (ROM) and reprogrammable (PROM).

Reprogrammable memory is written by the user with the help of a programming device. Before that, the contents of the memory must first be erased.

The older type of reprogrammable memory is EPROM (erasable programmable read-only memory), which is erased by ultraviolet radiation.

Fig. 3.7 EPROM memory, erased by ultraviolet light (source: http://ua.wikipedia.org/wiki/%D0%A4%D0%B0%D0%B9%D0%BB:Eprom.jpg).

EEPROM

Today, the classic two-transistor EEPROM technology has practically been replaced by NOR flash memory.

However, the name EEPROM has stuck to this memory segment regardless of the technology.

Fig. 3.8 Programming flash memory (source: http://ua.wikipedia.org/wiki/%D0%A4%D0%B0%D0%B9%D0%BB:Flash_programming_ua.svg).

Flash memory is a type of solid-state, non-volatile, rewritable memory. It can be read any number of times (within the data retention period, typically 10-100 years), but it can be written only a limited number of times (at most about a million cycles). It has no moving parts, so, unlike hard disks, it is more reliable and compact.

Due to its compactness, low cost and low energy consumption, flash memory is widely used in digital portable devices.

Division of the controller's memory areas

The controller provides memory areas for storing the user program, data, and configuration.

Load memory is non-volatile storage for the user program, data, and configuration.

When a project is downloaded to the controller, it is first stored in load memory. This memory is located either on a memory card (if one is present) or in the controller itself. Information in non-volatile memory is retained through a loss of power. The memory card supports more storage than the memory built into the controller.


Work memory is volatile storage.

The controller copies elements of the project from load memory into work memory. This memory area is lost when the power is switched off, and the controller restores it when the power returns.

Retentive memory is non-volatile storage for a limited number of work memory values. It serves to selectively preserve data through a loss of power.

Virtual copies of the inputs and outputs are kept in the controller's memory. To speed up data exchange, the processor reads this information from RAM rather than from the physical input/output terminals. The results of program processing are written to the outputs cyclically. After the supply voltage is switched off (drops below the critical level), important information is copied back from RAM to EEPROM. Only the data areas marked for retention are saved.
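The idea of working on a RAM copy of the terminals instead of the terminals themselves can be illustrated with a minimal sketch of one processing cycle. The function and variable names here are hypothetical, and the terminals are simulated with plain dictionaries; a real controller does this in firmware.

```python
def scan_cycle(read_inputs, user_program, write_outputs):
    """One controller cycle: copy inputs into RAM, run the program on the copy, write results out."""
    input_image = dict(read_inputs())         # snapshot of the physical input terminals
    output_image = user_program(input_image)  # the program only ever sees the RAM copy
    write_outputs(output_image)               # physical outputs are updated once per cycle
    return output_image


# Hypothetical terminals, simulated with dictionaries.
terminals_in = {"start": True, "stop": False}
terminals_out = {}


def program(inputs):
    """Toy user program: run the motor while start is pressed and stop is not."""
    return {"motor": inputs["start"] and not inputs["stop"]}


scan_cycle(lambda: terminals_in, program, terminals_out.update)
print(terminals_out)  # {'motor': True}
```

Because the program reads the snapshot, an input that changes in the middle of a cycle cannot produce inconsistent results within that cycle.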
  • What is a memory matrix?
  • How many memory cells are there in one row of the memory matrix?
  • How are the cells of the memory matrix numbered (direction and range)?
  • What are the main types of controller memory (name two)?
  • What advantages does each of the two types of memory have over the other?
  • What types of RAM can a controller have (2)?
  • What types is ROM divided into by how often it can be programmed (2)?
  • What types is reprogrammable ROM divided into by erasure method (2)?
  • What happens to the information in RAM when the controller is switched off? Does all of it disappear when the power is lost (and if not, where is it saved)?
  • What is the copy of the input/output terminal states in RAM called? Which memory block does the processor primarily work with?

Since the appearance of processors based on the Nehalem core, one of their advantages has been the integrated three-channel memory controller.

    In our review of motherboards, we already tried to assess the usefulness of the multi-channel memory mode in LGA1366 processors, and the results were disappointing.

    Disappointing for the mode, of course, not for users.

    However, the tests were run on only a small number of applications, so the question of whether the three-channel mode is needed in practice remained open. We decided to fill this gap. More precisely, at first we only wanted to try the dual-channel mode instead of the three-channel one, for a later comparison of the performance of the Core i7 900 and 800 series, so that afterwards there would be no guessing about what influenced the results most (if they really do turn out to be quite different). However, simply rerunning the tests from the latest version of our methodology in one more configuration is boring, and two variants alone would not settle the question, so we made the task more complex.

    Test bench configuration

    All testing was carried out using Vikoristanny

    Why these particular configurations?

    We need two three-channel configurations in order to understand clearly what matters to a given application: three channels or total capacity. This will be clearly visible from the results: if 3×2 is among the leaders and 3×1 is among the laggards, then it is not the three channels that matter, and the application simply needs a lot of memory (more precisely, it is able to use that memory). Without 3×1 it would be hard to reach an unambiguous answer.

    The reason for including 2×2 in the tests is obvious: current systems with Core 2 and AMD processors are equipped this way, and it will soon become even more widespread with systems on LGA1156 (of course, the 2×1 configuration could also be tested, but for systems outside the budget sector it is no longer of much interest).

    1×4 looks extremely synthetic, since hardly anyone who has two 2 GB memory modules will install them in one channel and leave the others empty, but we need it for completeness. Besides, DDR3 modules with a capacity of 4 GB have already appeared. It is a pity that this exotica has not yet reached us (otherwise the list of tested configurations would have included a 2×4 variant), but wide availability of such modules and of kits based on them is only a matter of time.

    Detailed results of all tests, as before, are presented in the table.

    With writing, things are even more interesting: the single-channel mode clearly runs into its theoretical bandwidth, and increasing the number of channels gave less than 20% in all cases.
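For orientation, the theoretical peak bandwidth that such measurements are compared against can be estimated from the channel count. The sketch below is an illustration only: the transfer rate of 1066 million transfers per second and the 8-byte channel width are assumptions about a DDR3-1066 setup, not values taken from the test bench in this article.

```python
def peak_bandwidth_gb_s(channels: int, transfers_per_s: float = 1066e6, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s: channels x bus width x transfer rate."""
    return channels * transfers_per_s * bus_bytes / 1e9


for n in (1, 2, 3):
    print(n, "channel(s):", round(peak_bandwidth_gb_s(n), 1), "GB/s")
```

The theoretical figure scales linearly with the number of channels, which is exactly why a measured gain of less than 20% from extra channels stands out.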

    And, finally, access latency.

    The obvious leader here is the dual-channel mode (recall that in this chart lower numbers are better). Single-channel access is not far behind, but in the three-channel mode latency rises considerably: by a quarter.

    Now we can put forward some hypotheses.

    As we remember from the behavior of other architectures with an integrated memory controller (AMD K8/K10), they are most sensitive to memory access latency, which is noticeable even in real applications.

    It is unlikely that Nehalem will behave differently.

    Moreover, where both reading and writing matter, the dual-channel mode may well become the leader.

    It is not a given that single-channel will be the fastest: its latency is low, but its bandwidth is much lower too, and that cannot go unnoticed.

    And there is one more class of applications besides those that need more memory and those that are indifferent to it: those that start to run slower with more RAM.

    At first glance this makes no sense: a slowdown caused by a lack of memory is easy to understand, but a surplus of memory should not be to blame for anything.

    On the other hand, why not? The effectiveness of caching may well depend on the amount of RAM that has to be cached. If a particular application needs a small and constant amount of memory, then the more RAM is installed, the smaller the share of the processor's cache it gets. For example, with six gigabytes installed, only half of the 8 MB L3 cache may be allocated to data from "foreground" programs (do not forget that the rest of the memory can also be "alive", not very actively, but the cache is spent on it too), while with three gigabytes they would get 2/3 of the 8 MB. This effect, of course, lies somewhat aside from the main topic of our investigation.

    Here, as always, the dual-channel mode is on average the fastest, and of the two three-channel variants, the one with more memory is more productive, regardless of the presence of the "renegade" applications.

    Raster graphics

    In principle everything is already clear: among raster editors we see all three "groups" of applications.
    0:09:07 0:04:45 0:08:05 0:08:12 0:17:42

    The conclusion? Those who mostly look at the difference between processors of different architectures tend to say that the Core i7 is simply the ideal processor for Photoshop, but in fact it has nothing particularly special. What is ideal here is not the core architecture but the amount of memory. With 6 GB, the Core i7 920 is twice as fast as the Core 2 Quad Q9300 with 4 GB. And this is exactly how configurations are split in most articles (on our site and on other resources alike): 3×2 for processors on LGA1366 and 2×2 for Core 2, AMD Phenom, etc.

    If we limit both processors to the same 4 GB (no matter how it is made up), it becomes clear that the lag of the Core 2 Quad is entirely within the range explained by the difference in clock frequency.

    And if we take just one gigabyte of memory from the Core i7 (we would have given 3 or 4: a small difference), then the result will be worse by a factor of two!

    This is the most striking example, but the others behave in a similar way, even if the difference is sometimes microscopic.

    And there is nothing surprising in this: Photoshop really does "love" memory, and the "heavier" the files being processed in it, the more it "loves" it, and all the performance testing utilities in ...

    The Java machine test turned out to be very sensitive to memory read speed, and that is what matters most for this task.

    This is exactly the picture one would expect to see under the assumption that three-channel memory access is a guarantee of high performance, while the amount of memory matters little.

    It is a pity that across the tests this assumption was confirmed literally a couple of times.

    But this is exactly an example where it was confirmed.

    Audio encoding

    A wonderful test: one could say that it does not depend on the memory system at all. In rendering there was at least some dependence, but here there is none.

    An ideal benchmark for processors, but hardly one for testing the system as a whole.

    Video encoding

    And here everything is exactly as the "naive theory" predicts.

    Apparently, there is little benefit in having the third memory controller channel in the Core i7 for LGA1366.

    The channel exists and can be used, but the results will not always improve.

    So it is needed in order to increase the number of supported memory modules.

    That number is the number of memory controllers multiplied by the number of modules each of them supports.

    The latter is the product of the number of supported channels and the number of modules that can work simultaneously on each channel.
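The two products described above can be written out directly. This is a small sketch with hypothetical example values (one or two controllers, three channels, two modules per channel); the article does not state these figures for any specific platform.

```python
def max_modules(controllers: int, channels_per_controller: int, modules_per_channel: int) -> int:
    """Total supported modules = controllers x (channels x modules per channel)."""
    modules_per_controller = channels_per_controller * modules_per_channel
    return controllers * modules_per_controller


print(max_modules(1, 3, 2))  # 6
print(max_modules(2, 3, 2))  # 12
print(max_modules(2, 2, 2))  # 8
```

With the same number of modules per channel, going from two channels to three raises the ceiling by half, which matters far more for a multi-socket server than for a desktop.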

    So it is the Xeon that needs a three-channel memory controller.

    The Opteron needs one too, but has not received it yet. Intel did not implement these channels for no reason. Both manufacturers will have to go this way anyway, since one of them has already tried the alternative (FB-DIMM, which increases the number of modules per channel) and was not entirely satisfied with it.

    And what does all this give an ordinary user at a desktop? That's right: nothing.

    But!

    It is too early to relax. As we have seen, the idyll on average arose only because different applications respond to memory in completely different ways. Some need a lot of memory, some are hindered by it, some do not care about capacity, and some, such as DivX, ignore all the "objective" memory parameters and prefer the three-channel mode in any form. Therefore, when comparing systems with different memory configurations within one article (or on your own), do not forget that in specific tests it is easy to be mistaken about what actually produced the result.

    However, we will not have to juggle different configurations for long: LGA1156, as is known, supports only two memory channels, so with these processors everything will be simple and logical.

    Processors for LGA1366 will continue to be tested in the 3×2 configuration, although where necessary we may also test them with 2×2 (when the influence of the memory system's characteristics has to be excluded).

    What is important is that all processors, from server to mobile ones, will be equipped with a memory controller. Of course, Intel is not pursuing only the benefit of a more efficient memory subsystem. Moving the memory controller into the processor reduces memory access latency, which becomes more relevant with the appearance of DDR-II modules running at higher frequencies. Since chipset manufacturers will have to work more closely with Intel representatives (third-party manufacturers will have to adapt to the capabilities of the memory controller integrated into Intel processors), the company gains more control over the manufacturing of motherboards and chipsets.

    We already noted the first signs of this trend when describing the architecture of the TwinCastle server chipset.

    The memory controller in this chipset is located in a separate chip. This allows more flexible motherboard design.

    Of course, there are no miracles.

    This is a Non-Uniform Memory Access (NUMA) configuration, so memory accesses can be more or less costly depending on where the data is located in memory.

    Access to local memory naturally has the lowest latency and the highest bandwidth, while access to remote memory goes through the QPI link, which reduces performance. The impact on performance is hard to predict, since it depends on the application and the operating system.

    Intel says that the performance penalty for remote access is about 70% in terms of latency, and that bandwidth can be half that of local access.
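To get a feel for what these figures mean in practice, the average latency can be estimated for a given share of remote accesses. The sketch below uses the roughly 70% latency penalty quoted in the text; the local latency of 60 ns is a purely hypothetical value chosen for the example.

```python
def average_latency_ns(local_ns: float, remote_fraction: float, remote_penalty: float = 0.70) -> float:
    """Weighted average of local and remote memory latency."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    remote_ns = local_ns * (1.0 + remote_penalty)
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


LOCAL_NS = 60.0  # assumption: hypothetical local access latency
for share in (0.0, 0.25, 0.5, 1.0):
    print(f"{share:.0%} remote: {average_latency_ns(LOCAL_NS, share):.1f} ns")
```

This is why the operating system's placement of data matters so much on such systems: the closer the remote share is to zero, the closer the machine behaves to a single-socket one.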

    Then there is a large Level 3 cache (8 MB), which handles communication between the cores.

    At first glance, the Nehalem cache architecture resembles that of Barcelona, but the operation of the third-level cache is very different from AMD's: it is inclusive of all lower levels of the cache hierarchy. This means that if a core tries to access data that is not present in the L3 cache, there is no need to look for it in the private caches of the other cores: it is not there either. Conversely, if the data is present, bits associated with each cache line (one bit per core) show whether the data may be present (potentially, but without a guarantee) in the lower-level cache of another core, and if so, in which one.
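The lookup logic described above can be sketched as follows. This is a simplified software model under stated assumptions (four cores, a dictionary standing in for the L3 cache, hypothetical names such as `L3Line` and `cores_to_snoop`); it is not a description of the actual hardware implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

CORES = 4  # assumption: four cores, hence four presence bits per cache line


@dataclass
class L3Line:
    data: bytes
    core_bits: int = 0  # bit i set => core i may (but need not) hold a copy


def cores_to_snoop(l3: Dict[int, L3Line], address: int, requester: int) -> Optional[List[int]]:
    """Return None on an L3 miss (inclusive cache: no private cache can hold the line),
    otherwise the list of other cores whose presence bit is set."""
    line = l3.get(address)
    if line is None:
        return None
    return [c for c in range(CORES) if c != requester and (line.core_bits >> c) & 1]


l3 = {0x1000: L3Line(data=bytes(64), core_bits=0b0101)}  # cores 0 and 2 may hold the line
print(cores_to_snoop(l3, 0x1000, requester=0))  # [2]
print(cores_to_snoop(l3, 0x2000, requester=0))  # None
```

In the miss case no other core has to be asked at all, and in the hit case only the cores flagged in the bit mask are candidates, which is where the reduction in inter-core traffic comes from.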

    This technique is very effective for keeping the private caches of each core coherent, because it reduces the need to exchange information between cores. It does, of course, have the drawback of wasting part of the cache memory on data that is already present in other cache levels. However, this is not so bad, since the L1 and L2 caches are relatively small compared to the L3 cache: all data from the L1 and L2 caches takes up at most 1.25 MB of the 8 MB available in the L3 cache.
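The 1.25 MB figure can be checked with simple arithmetic. The per-core cache sizes used below (32 KB L1 instruction, 32 KB L1 data, 256 KB L2, four cores) are not given in the text; they are assumed here as the commonly published Nehalem sizes.

```python
CORES = 4               # assumption: quad-core Nehalem
L1_INSTRUCTION_KB = 32  # assumption: per-core L1 instruction cache
L1_DATA_KB = 32         # assumption: per-core L1 data cache
L2_KB = 256             # assumption: per-core L2 cache
L3_MB = 8               # from the text

private_mb = CORES * (L1_INSTRUCTION_KB + L1_DATA_KB + L2_KB) / 1024
print(private_mb)          # 1.25
print(private_mb / L3_MB)  # 0.15625
```

So in the worst case a little under 16% of the L3 cache duplicates data that is also held in the private caches.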

    As with Barcelona, the third-level cache operates at a different frequency from the rest of the chip.

    In many cases, processors work not with physical memory addresses but with virtual ones.

    Among other advantages, this approach lets a program be given more memory than the computer actually has, keeping only the data needed at the moment in physical memory and everything else on the hard disk.

    This means that every memory access to a virtual address must be translated into a physical address, and a huge table must be kept to store the correspondences.

    The problem is that this table is so large that it cannot be stored on the chip: it is located in main memory, and it can even be swapped out to the hard disk (part of the table may be absent from memory, having been paged out to the HDD).

    If such an address translation step were needed for every memory operation, everything would run far too slowly.
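The translation step, and the usual way of avoiding its cost, can be shown in a toy model. Everything here is an assumption made for illustration: the 4 KB page size, the dictionary used as a page table, and the small cache of recent translations (in hardware this role is played by a translation lookaside buffer, which the text above does not name).

```python
PAGE_SIZE = 4096  # assumption: 4 KB pages


def translate(virtual_address: int, page_table: dict, cache: dict) -> int:
    """Translate a virtual address to a physical one, caching recent page translations."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page in cache:
        frame = cache[page]       # fast path: translation already known
    else:
        frame = page_table[page]  # slow path: table lookup (KeyError = page not resident)
        cache[page] = frame
    return frame * PAGE_SIZE + offset


page_table = {0: 7, 1: 3}  # hypothetical mapping: virtual page -> physical frame
recent = {}
print(translate(4100, page_table, recent))  # 12292
print(recent)                               # {1: 3}
```

The second access to the same page no longer touches the big table, which is the whole point: most programs reuse a small set of pages, so most translations hit the fast path.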

    Optimizing unaligned memory access

    In the Core architecture, memory access was subject to a number of performance restrictions.

    The processor was optimized for access to memory addresses aligned on 64-byte boundaries, that is, the size of one cache line.
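Whether a given access stays inside one 64-byte cache line is easy to check numerically. The helper names below are hypothetical and the sketch only illustrates the address arithmetic, not any processor's internal logic.

```python
CACHE_LINE = 64  # cache line size in bytes, as stated in the text


def is_aligned(address: int, size: int) -> bool:
    """True if the address is a multiple of the access size."""
    return address % size == 0


def crosses_cache_line(address: int, size: int) -> bool:
    """True if an access of `size` bytes starting at `address` touches two cache lines."""
    first_line = address // CACHE_LINE
    last_line = (address + size - 1) // CACHE_LINE
    return first_line != last_line


print(crosses_cache_line(56, 16))  # True: bytes 56..71 span lines 0 and 1
print(crosses_cache_line(48, 16))  # False: bytes 48..63 stay in line 0
print(is_aligned(56, 16))          # False
```

An access can be unaligned without crossing a line, but only accesses that straddle a boundary need data from two lines.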

    Not only was access slower for unaligned data, but executing an unaligned load or store instruction was also more expensive than executing an aligned one, regardless of the actual alignment of the data in memory.

    The reason is that these instructions generated several micro-operations in the decoders, which reduced the throughput of these types of instructions.

    As a result, compilers avoided generating instructions of this type and replaced them with sequences of instructions that were less costly. Thus, a memory read that overlapped two cache lines cost approximately 12 cycles, compared to 10 cycles for a write.

    But in the server environment, prefetching often led to a loss of performance.

    There are a number of reasons for this inefficiency.

    First of all, memory accesses are often much harder to predict in server applications.

    Database accesses, for example, are far from linear: when one data item is accessed in memory, it does not mean that the adjacent item will be accessed next. This limits the effectiveness of the prefetch unit. But the main problem was memory bandwidth in multi-socket configurations.



    As we said earlier, it was already a bottleneck for multiple processors, and on top of that, the prefetchers added extra pressure at this level.

    It is quite obvious that the greatest gains will come in situations where RAM was the main bottleneck.

    If you have read the article in its entirety, you will have noticed that Intel's engineers paid the most attention to this area.

    © 2022 androidas.ru - All about Android