what is the right way to update MMU translation table
Asked Answered
F

1

3

I enabled MMU on my s3c2440 board (3G - 4G memory :: the fault attribute),everything was just fine when I didn't read/write 3G - 4G memory .So to test the page fault vector ,I wrote to a 0xFF to the 3G address,as I expected ,I got the right value from FSR ,So I did this in _do_page_fault (), the step was like this :

.....                 // set new page to translation table 
.....
invlidate_icache ();  // clear icache 
clr_dcache ();        // wb is used ,clear dcache 
invalidate_ttb ();    // invalidate translation table 

and then ISR_dataabort returned ,I read the 3G address to get the 0xFF which I wote before . Unfortunately I got a data abort again .(I am sure the translation table value I set is ok)

So what is the correct way to update the MMU translation table .Any help is appreciate ! thanks

here is the Main code I used (just for some test), (I am a littlte strange to ARM ARCH,so this code might be urgly )

/* MMU TTB 0 BASE ATTR */
#define TTB0_FAULT          (0|(1<<4))          /* TTB FAULT */
#define TTB0_COARSE         (1|(1<<4))          /* COARSE PAGE BASE ADDR */
#define TTB0_SEG            (2|(1<<4))          /* SEG BASE ADDR */
#define TTB0_FINE           (3|(1<<4))          /* FINE PAGE BASE ADDR */

/* MMU TTB 1 BASE ATTR */ 
#define TTB1_FAULT          (0)
#define TTB1_LPG            (1)                 /* Large page */
#define TTB1_SPG            (2)                 /* small page */
#define TTB1_TPG            (3)                 /* tiny page */

/* domain access priority level */
#define FAULT_PL            (0x0)               /* domain fault */
#define USR_PL              (0x1)               /* usr mode */
#define RSV_PL              (0x2)               /* reserved */
#define SYS_PL              (0x3)               /* sys mode */

#define DOMAIN_FAULT        (0x0<<5)            /* fault 0*/
#define DOMAIN_SYS          (0x1<<5)            /* sys 1*/
#define DOMAIN_USR          (0x2<<5)            /* usr 2*/

/* C,B bit */
#define CB                  (3<<2)              /* cache_on, write_back */
#define CNB                 (2<<2)              /* cache_on, write_through */ 
#define NCB                 (1<<2)              /* cache_off,WR_BUF on */
#define NCNB                (0<<2)              /* cache_off,WR_BUF off */

/* ap 2 bits */
#define AP_FAULT            (0<<10)             /* access deny */
#define AP_SU_ONLY          (1<<10)             /* rw su only */
#define AP_USR_RO           (2<<10)             /* sup=RW, user=RO */
#define AP_RW               (3<<10)             /* su=RW, user=RW */

/* page dir 1 ap0 */
#define AP0_SU_ONLY         (1<<4)             /* rw su only */
#define AP0_USR_RO          (2<<4)             /* sup=RW, user=RO */
#define AP0_RW              (3<<4)             /* su=RW, user=RW */

/* page dir 1 ap1 */
#define AP1_SU_ONLY         (1<<6)             /* rw su only */
#define AP1_USR_RO          (2<<6)             /* sup=RW, user=RO */
#define AP1_RW              (3<<6)             /* su=RW, user=RW */


/* page dir 1 ap2 */
#define AP2_SU_ONLY         (1<<8)             /* rw su only */
#define AP2_USR_RO          (2<<8)             /* sup=RW, user=RO */
#define AP2_RW              (3<<8)             /* su=RW, user=RW */

/* page dir 1 ap3 */
#define AP3_SU_ONLY         (1<<10)             /* rw su only */
#define AP3_USR_RO          (2<<10)             /* sup=RW, user=RO */
#define AP3_RW              (3<<10)             /* su=RW, user=RW */


#define RAM_START           (0x30000000)
#define KERNEL_ENTRY        (0x30300000)        /* BANK 6 (3M) */
#define KERNEL_STACK        (0x3001A000)        /* BANK 6 (16K + 64K + 16K + 8K) (8k kernel stack) */
#define IRQ_STACK           (0x3001B000)        /* 4K IRQ STACK */
#define KERNEL_IMG_SIZE     (0x20000)            

#define IRQ_STACK           (0x3001B000)

/* 16K aignment */
#define TTB_BASE            (0x30000000)        
#define PAGE_DIR0           (TTB_BASE)
#define TTB_FULL_SIZE       (0x4000)        
#define PAGE_DIR1           (TTB_BASE+TTB_FULL_SIZE)                 

#define PAGE_DIR0_SIZE      (0x4000)            /* 16k */


    void _do_page_fault (void)
    {
        //
        ...........
        // 
        // read the FSR && get the vaddr && type here 
    volatile unsigned *page_dir = (volatile unsigned*)(TTB_BASE);  
    unsigned index = vaddr >> 20,i = 0, j = 0;
    unsigned page = 0;

        if (!(page_dir[index] & ~(0x3FF) && (type == 0x0B))) {          /* page_dir empty */
            i = index & ~0x03;                                          
            if ( (page_dir[i+0] & ~(0x3FF)) || (page_dir [i+1] & ~(0x3FF))         
              || (page_dir[i+2] & ~(0x3FF)) || (page_dir [i+3] & ~(0x3FF)) )
            {
                panic ( "page dir is bad !\n" );                        /* 4 continuous page_dir must be 0 */
            }

            if (!(page = find_free_page ())) 
                panic ( "no more free page !\n" );                      /* alloc a page page dir*/

            page_dir[i+0] = (page + 0x000) | DOMAIN_USR | TTB0_COARSE ; /* small page 1st 1KB */
            page_dir[i+1] = (page + 0x400) | DOMAIN_USR | TTB0_COARSE ; /* small page 2nd 1KB */
            page_dir[i+2] = (page + 0x800) | DOMAIN_USR | TTB0_COARSE ; /* small page 3rd 1KB */
            page_dir[i+3] = (page + 0xC00) | DOMAIN_USR | TTB0_COARSE ; /* small page 4th 1KB */

            if (!(page = find_free_page ())) 
                panic ( "no more free page !\n" );                      /* alloc a page page table*/

            volatile unsigned *page_tbl = (volatile unsigned*) (page_dir[index] & ~(0x3FF));

            *page_tbl = page|AP0_RW|AP1_RW|AP2_RW|AP3_RW| NCNB|TTB1_SPG;/* small page is used */


            invalidate_icache ();

            for (i = 0; i < 64; i++)
            {
                for (j = 0;j < 8;j ++)
                    clr_invalidate_dcache ( (i<<26)|(j<<5) );
            }


            invalidate_tlb ();
        }
        ........
        //

    }

    /* here is the macros */

    #define invalidate_tlb()    \
    {\
         __asm__  __volatile__ (\
            "mov    r0,#0\n"\
            "mcr    p15,0,r0,c8,c7,0\n"\
            :::"r0" \
        );\
    }

    #define clr_invalidate_dcache(index)    \
    {\
        __asm__  __volatile__ (\
            "mcr    p15,0,%[i],c7,c14,2\n"\
            :: [i]"r"(index)\
        );\
    }


    #define invalidate_icache() \
    {\
        __asm__  __volatile__ (\
            "mov    r0,#0\n"\
            "mcr    p15,0,r0,c7,c5,0\n"\
            ::: "r0"\
        );\
    }

    #define invalidate_dcache() \
    {\
        __asm__  __volatile__ (\
            "mov    r0,#0\n"\
            "mcr    p15,0,r0,c7,c6,0\n"\
            ::: "r0"\
        );\
    }



    #define invalidate_idcache()    \
    {\
        __asm__  __volatile__ (\
            "mov    r0,#0\n"\
            "mcr    p15,0,r0,c7,c7,0\n"\
            :::"r0"\
        );\
    }\
Flowing answered 5/5, 2013 at 9:7 Comment(4)
Please provide actual code used. A short bit of misspelt pseudocode without any information about what operating system you are using does not provide the faintest clue as to what your problem might be. Also, the fact that you are sure about your translation table entry doesn't mean it is correct, so please include that too.Millepede
The code I used is posted above ,I don't know why my s3c2440 halt when called invalidate_tbl (),can you help me ,thanks !Flowing
If you don't want to show your defines, you could show hex dumps of the L1 and L2 table entries just before aborts. Be sure to provide offsets versus TTB_BASE and L2 bases. If you do this, you may answer your own question.Sailfish
all the #defines are posted now.thanks for helping me ^_^Flowing
S
4

Note: I assume TTB_BASE is the primary ARM L1 page table. If it is some shadow, you need to show more code as per unixsmurf. Here is my best guess...

Your page_dir is functioning as both the primary L1 entries and as the L2 fine page table. The TTB_BASE should only contain sections, super sections or pointers to sub-page tables. You need to allocate more physical memory for the L2 page tables.

I guess your page_dir[i+0], page_dir[i+1], etc are overwriting other L1 section entries and Your invalidate_tlb() make this concrete to the CPU. You should be using the L2 page_tbl pointer to set/index the small/fine pages. Perhaps your kernel code is in the 3G-4G space and your are over-writing some critical L1 mappings there; any number of strange things could happen. The current use of page_tbl is unclear.

You can not double use the primary L1 tables as they have meaning to hardware. I guess that this is just a mistake?

There are only three types of entries in the primary L1 page tables,

  1. Sections - a 1MB memory mapping straight virt to phys, with no 2nd page table.
  2. Super-sections - a 4MB memory mapping as per sections, but minimizing TLB pressure.
  3. A 2nd page table. Entries can be 1k,4k,and 64k. Fine page tables are 4k in table size and are not in modern ARM designs. Each entry is 1k of address in a fine page table.

The primary page table must be on a physical address that is 16k aligned, which is also it's size. 16k/(4bytes/entry)*1MB gives a 4GB address range of the L1 tables. So each entry in the primary L1 page table always refers to a 1MB entry. Coarse page tables are the only option on newer ARMs and they refer to a 1K L2 table.

1K_L2_size * 4K_entry / (4bytes_per_entry) gives 1MB address space.
1K_L2_size * 64K_entry / (16bytes_per_entry) gives 1MB address space.

There are four entries of 1MB each in the primary L1 for a super-section. There are four entries each for a 64k large page in the L2 table. If you are using super-sections, you do not have L2 entries.

I think you may have Super-sections and large pages mixed up? For certain having improperly formatted page tables will only manifest on a TLB invalidate, so that MMU mappings are re-fetched from the tables via a walk.

Finally, you should flush the D cache and drain the write buffer to ensure consistency with memory. You may wish to have a memory barrier as well.

static inline void dcache_clean(void)
{
    const int zero = 0;
    asm volatile ("" ::: "memory"); /* barrier */
    /* clean entire D cache -> push to external memory. */
    asm volatile ("1: mrc p15, 0, r15, c7, c10, 3\n"
                    " bne 1b\n" ::: "cc");
    /* drain the write buffer */
    asm volatile ("mcr 15, 0, %0, c7, c10, 4"::"r" (zero));
}

There are also co-processor commands to invalidate single TLB entries as you are only changing a portion of the vaddr/paddr mappings in the data fault.

See also: ARM MMU tutorial, Virtual Memory structures, and your ARM Architecture Reference Manual.

Sailfish answered 5/5, 2013 at 14:41 Comment(9)
yes,TTB_BASE is Level 1 page_table which cost 16KB phys address .The situation is that I want to use section map 0x0000 - 0x30000000 && 0x40000000 - 0xA0000000,and use coarse page map (cost 1KB), 0x30000000 - 0x34000000 (SDRAM = 64M).for the reason that L2 (coarse page table costs 1KB ,I let 4 continuous 4 L2 page tables use the same page.that's why I use 'i = index & ~0x03' .Flowing
in L1 table, i = index & ~3 to get the 4M alignment index ,in L1 cost 4 * 4Byte ,when one of the 4M's page dir is empty, all the L1 page dir in "index & ~3" enties should ptr to the same page *page_tbl means I used page_tbl[0],may be it is wrong when vaddr is not 4M alignment,but the 3G address is just fine .Flowing
one L1 page_table takes 4 bytes ,it ptr to 1MB whatever the type is.When I use coarse page,L1 page_table ptr to a 1KB phys mem,but find_free_page () returns 4KB page,So I let the L1 4-continuous entries ptr to the same page,and if vaddr is 3M,then the L1 entry of 3M (4K page)is the same as 0M,1M,2M. "i = index & ~3" get the 4M alignment entry.Flowing
I mean the L1 entry OF 3M vaddr takes just the same phys page as 0M,1M,2M...but their offset is different,0M has 0 - (0x3FF),1M has 0X400 - 7ff) ......Flowing
It's ok now after I turned off the mmu and icache ,dcache too.and then invalidate i & d cache ,at the end I turned on mmu i d cache again ... Is it the right way to update MMU translation table ??? though it works, IS IT COST TOO MUCH CPUs ???? ANY ONE KNOWS ???Flowing
No, that is not needed, but is good information to add. I put some additional code in my answer that may help.Sailfish
thanks a lot ^_^.I read ARM_ARM now ,I have got the correct way to update MMU table : 1. cleaning the data cache if it is a write-back cache 2. invalidating the data cache 3. invalidating the instruction cache 4. draining the write buffer.Flowing
It is odd to invalidate the icache. I guess this is for stale cache that is in the new TLB entry. As it was a data fault, you could assume that it was not in the icache, but safe is better than sorry.Sailfish
Now,i don't need to turn off mmu ,I just put the code: clr_all_dcache (); //dcache_clean (); invalidate_dcache (); __asm__ __volatile__ ("mcr p15,0,%0,c8,c6,1"::"r"(vaddr));Flowing

© 2022 - 2024 — McMap. All rights reserved.