L2 Cache
How to enable the L2 cache
See http://www.mybatis.org/mybatis-3/sqlmap-xml.html#cache
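As a quick illustration (a sketch; StudentMapper, Student and the SQL are made-up names): with annotation-based mappers, @CacheNamespace enables the L2 cache for a namespace and is the annotation counterpart of the <cache/> element described at the link above. The global cacheEnabled setting defaults to true, so normally only the mapper-level declaration is needed.
// Sketch: enable the L2 cache for this mapper's namespace via annotation.
// Equivalent to adding <cache/> inside the corresponding XML mapper.
import org.apache.ibatis.annotations.CacheNamespace;
import org.apache.ibatis.annotations.Select;

@CacheNamespace
public interface StudentMapper {
  @Select("SELECT id, name, age FROM student WHERE id = #{id}")
  Student selectById(long id);
}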
How the cache works
Construction steps
We can read the relevant source code here:
// Impl: PERPETUAL (permanent): org.apache.ibatis.cache.impl.PerpetualCache
// Decorator: LRU (least recently used): org.apache.ibatis.cache.decorators.LruCache
org.apache.ibatis.builder.MapperBuilderAssistant#useNewCache
- Construct a new instance of `cache` using reflection.
- Set individual properties using SystemMetaObject.
- Call org.apache.ibatis.builder.InitializingObject#initialize for customization, see #816.
- Add the decorator cache chain via `build` (see the sketch below).
- Put the `cache` into a `Map<NameSpace, Cache>`.
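The steps above are driven by a builder chain; here is a simplified sketch of what it produces for a default <cache/> element (method names from org.apache.ibatis.mapping.CacheBuilder; exact parameters vary by MyBatis version, and the real method also registers the result in the Configuration):
import org.apache.ibatis.cache.Cache;
import org.apache.ibatis.cache.decorators.LruCache;
import org.apache.ibatis.cache.impl.PerpetualCache;
import org.apache.ibatis.mapping.CacheBuilder;

// Simplified sketch of the builder call behind useNewCache for a default <cache/>.
public class UseNewCacheSketch {
  public static Cache buildDefaultCache(String namespace) {
    return new CacheBuilder(namespace)          // id = mapper namespace
        .implementation(PerpetualCache.class)   // base impl, instantiated via reflection
        .addDecorator(LruCache.class)           // eviction decorator
        .readWrite(true)
        .build();                               // wires the decorators, applies properties,
                                                // calls InitializingObject#initialize
  }
}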
If you are using a standard cache, you will get:
// see org.apache.ibatis.annotations.CacheNamespace
// It may **produce dirty data** in distributed deployments.
SynchronizedCache -> LoggingCache -> LruCache -> PerpetualCache
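To make the chain concrete, here is the same stack built by hand (illustrative only; MyBatis assembles it for you during the construction steps above):
import org.apache.ibatis.cache.Cache;
import org.apache.ibatis.cache.decorators.LoggingCache;
import org.apache.ibatis.cache.decorators.LruCache;
import org.apache.ibatis.cache.decorators.SynchronizedCache;
import org.apache.ibatis.cache.impl.PerpetualCache;

// Hand-built equivalent of the standard decorator chain for one namespace.
public class DefaultChainSketch {
  public static Cache buildByHand(String namespace) {
    Cache cache = new PerpetualCache(namespace); // base store: a plain HashMap
    cache = new LruCache(cache);                 // least-recently-used eviction
    cache = new LoggingCache(cache);             // logs the hit ratio
    return new SynchronizedCache(cache);         // serializes access for thread safety
  }
}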
And if you are using a customized cache, you will get
LoggingCache -> CustomCache
To get a log, override `getId` and return the mapper's namespace as the id.
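A minimal custom cache could look like the sketch below (assuming MyBatis 3.5+, where these are the only abstract methods of org.apache.ibatis.cache.Cache; the HashMap store is just a stand-in for a real backend such as Redis):
import java.util.HashMap;
import java.util.Map;
import org.apache.ibatis.cache.Cache;

// Illustrative custom cache. MyBatis requires the (String id) constructor and passes
// the mapper's namespace as the id; returning it from getId() makes LoggingCache
// output attributable to the namespace.
public class CustomCache implements Cache {
  private final String id;
  private final Map<Object, Object> store = new HashMap<>(); // stand-in backend

  public CustomCache(String id) { this.id = id; }

  @Override public String getId() { return id; }
  @Override public void putObject(Object key, Object value) { store.put(key, value); }
  @Override public Object getObject(Object key) { return store.get(key); }
  @Override public Object removeObject(Object key) { return store.remove(key); }
  @Override public void clear() { store.clear(); }
  @Override public int getSize() { return store.size(); }
}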
Cache process flow
By default (cacheEnabled=true), the framework creates a CachingExecutor as a proxy (this is what is called the second-level cache) in front of the database executor.
// Query -> CachingExecutor -> SimpleExecutor
org.apache.ibatis.session.Configuration#newExecutor
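Paraphrased, the wiring looks roughly like this (not the verbatim source; check newExecutor in your MyBatis version):
// The base executor is chosen by ExecutorType, then wrapped by CachingExecutor when
// cacheEnabled is true, so every query passes through the L2 cache first.
Executor executor;
switch (executorType) {
  case BATCH: executor = new BatchExecutor(this, transaction); break;
  case REUSE: executor = new ReuseExecutor(this, transaction); break;
  default:    executor = new SimpleExecutor(this, transaction); break;
}
if (cacheEnabled) {
  executor = new CachingExecutor(executor);                  // the L2-cache proxy
}
executor = (Executor) interceptorChain.pluginAll(executor);  // plugins wrap last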
Here is a brief process-flow diagram demonstrating what MyBatis does with its caches when a query comes in.
- L1 cache implementation: Java HashMap, aka LocalCache.
- L2 cache implementation: Redis.
`HGET` and `HSET` are commands for the Redis hash data type, and `id` is the namespace of the mapper.
sequenceDiagram
Query-->>CacheExecutor: HGET id cacheKey?
CacheExecutor-->> SimpleExecutor: HashMap.get(cacheKey)
SimpleExecutor -->> DB: select * from TABLE
activate DB
DB -->> SimpleExecutor: value
deactivate DB
CacheExecutor -->> SimpleExecutor: HashMap.put(cacheKey, value)
SimpleExecutor -->> CacheExecutor: value
Query -->>CacheExecutor: HSET id cacheKey value
CacheExecutor-->>Query: value
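In a Redis-backed variant of the custom cache above, getObject/putObject could map onto those commands roughly like this (Jedis API; the pool field and the keyOf/serialize/deserialize helpers are assumptions, not MyBatis or Redis-cache code):
// The mapper namespace (id) is the Redis hash, each CacheKey is a field in it.
@Override
public Object getObject(Object key) {
  try (Jedis jedis = pool.getResource()) {
    byte[] raw = jedis.hget(id.getBytes(), keyOf(key));        // HGET id cacheKey
    return raw == null ? null : deserialize(raw);
  }
}

@Override
public void putObject(Object key, Object value) {
  try (Jedis jedis = pool.getResource()) {
    jedis.hset(id.getBytes(), keyOf(key), serialize(value));   // HSET id cacheKey value
  }
}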
For more information (distributed Redis lock), read my gitbook on Redis.
Transactions (TransactionalCacheManager)
In the L2 cache, only `put`, `get` and `clear` will be called, even though all methods of the interface are implemented.
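The staging that makes this transactional is done by TransactionalCache; here is a simplified model of the idea (not the actual class):
import java.util.HashMap;
import java.util.Map;
import org.apache.ibatis.cache.Cache;

// Simplified model: writes are staged per session and only pushed to the shared
// namespace Cache on commit, so uncommitted data never reaches the L2 cache.
public class TransactionalCacheSketch {
  private final Cache delegate;                                 // the namespace cache
  private final Map<Object, Object> entriesToAddOnCommit = new HashMap<>();

  public TransactionalCacheSketch(Cache delegate) { this.delegate = delegate; }

  public Object getObject(Object key) { return delegate.getObject(key); } // reads pass through
  public void putObject(Object key, Object value) { entriesToAddOnCommit.put(key, value); }
  public void commit() { entriesToAddOnCommit.forEach(delegate::putObject); entriesToAddOnCommit.clear(); }
  public void rollback() { entriesToAddOnCommit.clear(); }      // staged entries are dropped
}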
Redis caching improvements
In addition to the LinkedHashMap-based LRU cache, we also use Redis for distributed caching. There is already a Jedis-based open-source project called Redis-cache; however, there are some improvements to be made.
- It creates a pool on each construction; a singleton instance is better.
- It doesn't support Redis sentinel mode.
- The JDK-based serializer is risky when deployed across different platforms; JSON, XML or Parcelable is preferred.
- There is no namespace for the Redis keys. `cache:com.xx.mapper` is more maintainable and debuggable when you `DEL` keys by prefix (see the sketch below).
You need to fork the project and create your own cache.
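For the namespace point, the forked clear() could delete only the keys under the namespace prefix (a sketch assuming the Jedis 3.x API and a flat cache:<namespace>:<cacheKey> key layout; pool and the prefix are illustrative):
// SCAN is used instead of KEYS so Redis is not blocked by a large keyspace.
@Override
public void clear() {
  try (Jedis jedis = pool.getResource()) {
    String cursor = ScanParams.SCAN_POINTER_START;
    ScanParams params = new ScanParams().match("cache:" + id + ":*").count(100);
    do {
      ScanResult<String> page = jedis.scan(cursor, params);
      page.getResult().forEach(jedis::del);                     // DEL each matched key
      cursor = page.getCursor();
    } while (!"0".equals(cursor));
  }
}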
Handling multiple tables with cache-ref
Suppose there is a student with lessons, and both mappers turn the cache on.
<!-- com.xxx.lessonMapper -->
<mapper namespace="com.xxx.lessonMapper">
<cache type="REDIS"/>
<select id="selectLessonWithStudent">
SELECT s.name, s.age, l.name as L_NAME
from student s left join lesson l
on s.lesson_id = l.id
</select>
</mapper>
<!-- com.xxx.studentMapper -->
<mapper namespace="com.xxx.studentMapper">
<cache type="REDIS"/>
<update id="updateName">
UPDATE student SET name = #{name} WHERE id = #{id}
</update>
</mapper>
When a student's name is updated, the result of `selectLessonWithStudent` is not flushed, and dirty data will be fetched.
This is fixed by sharing the namespace with cache-ref:
<!-- com.xxx.lessonMapper -->
<mapper namespace="com.xxx.lessonMapper">
- <cache type="REDIS"/>
+ <cache-ref namespace="com.xxx.studentMapper"/>
<select id="selectLessonWithStudent">
SELECT s.name, s.age, l.name as L_NAME
from student s left join lesson l
on s.lesson_id = l.id
</select>
</mapper>
When data in lesson or student is updated, `flushCache` will be called and ALL cache entries in the namespace will be flushed; there is no invalidation by a specific cacheKey, so the hit ratio will be noticeably lower.
// ALL cache in the same namespace will be flushed.
org.apache.ibatis.executor.CachingExecutor#flushCacheIfRequired
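Paraphrased, the check is roughly (not the verbatim source):
// When a statement is marked flushCache (the default for insert/update/delete),
// the WHOLE namespace cache is cleared; there is no per-key invalidation.
private void flushCacheIfRequired(MappedStatement ms) {
  Cache cache = ms.getCache();
  if (cache != null && ms.isFlushCacheRequired()) {
    tcm.clear(cache);                          // tcm = the TransactionalCacheManager
  }
}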
Conclusion
- It's better to use the cache on a single table only.
- When caching joined tables, use cache-ref to share the namespace or turn the cache off manually.
- It's better to handle caching in business code and derive your own cache key (e.g. store the result in Elasticsearch as a document).
Alternative performance improvements
- Analyse the SQL AST in an interceptor and flush only the changed entries -> it's too complex.
- Do static analysis on the XML and SQL -> it's too complex too.
APPENDIX
Avoiding the risk of the L1 cache
In most situations, leaving the L1 cache on is risky if you do not have full control over the project: two cached results may refer to the same object (e.g. repeated queries in a for loop).
// e.g. in a service method, inside one SqlSession
List<Student> list1 = mapper.select();
// modify the returned object in place
list1.get(0).setName("Modified");
// the second identical query returns the same, now dirty, list from the L1 cache
List<Student> list2 = mapper.select();
assert list1 == list2; // same reference
To fix the problem:
- Avoid issuing the same query inside @Transactional, and always remove repeated queries.
- Turn the L1 cache off (see issue #482) and hit the DB directly.
<settings>
<!-- flushes the local HashMap after each statement in BaseExecutor. -->
<setting name="localCacheScope" value="STATEMENT"/>
</settings>