L2 Cache
How to enable the L2 cache
See http://www.mybatis.org/mybatis-3/sqlmap-xml.html#cache
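As a quick illustration (a sketch; StudentMapper, Student and the SQL are made-up names): with annotation-based mappers, @CacheNamespace enables the L2 cache for a namespace and is the annotation counterpart of the <cache/> element described at the link above. The global cacheEnabled setting defaults to true, so normally only the mapper-level declaration is needed.
// Sketch: enable the L2 cache for this mapper's namespace via annotation.
// Equivalent to adding <cache/> inside the corresponding XML mapper.
import org.apache.ibatis.annotations.CacheNamespace;
import org.apache.ibatis.annotations.Select;

@CacheNamespace
public interface StudentMapper {
  @Select("SELECT id, name, age FROM student WHERE id = #{id}")
  Student selectById(long id);
}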
How the cache works
Construction steps
We can read the relevant source code here:
// Impl: PERPETUAL (permanent): org.apache.ibatis.cache.impl.PerpetualCache
// Decorator: LRU (least recently used): org.apache.ibatis.cache.decorators.LruCache
org.apache.ibatis.builder.MapperBuilderAssistant#useNewCache
- Construct a new instance of `cache` using reflection.
- Set individual properties using SystemMetaObject.
- Call org.apache.ibatis.builder.InitializingObject#initialize for customization, see #816.
- Add the decorator cache chain via `build` (see the sketch below).
- Put the `cache` into a `Map<NameSpace, Cache>`.
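The steps above are driven by a builder chain; here is a simplified sketch of what it produces for a default <cache/> element (method names from org.apache.ibatis.mapping.CacheBuilder; exact parameters vary by MyBatis version, and the real method also registers the result in the Configuration):
import org.apache.ibatis.cache.Cache;
import org.apache.ibatis.cache.decorators.LruCache;
import org.apache.ibatis.cache.impl.PerpetualCache;
import org.apache.ibatis.mapping.CacheBuilder;

// Simplified sketch of the builder call behind useNewCache for a default <cache/>.
public class UseNewCacheSketch {
  public static Cache buildDefaultCache(String namespace) {
    return new CacheBuilder(namespace)          // id = mapper namespace
        .implementation(PerpetualCache.class)   // base impl, instantiated via reflection
        .addDecorator(LruCache.class)           // eviction decorator
        .readWrite(true)
        .build();                               // wires the decorators, applies properties,
                                                // calls InitializingObject#initialize
  }
}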
If you are using a standard cache, you will get:
// see org.apache.ibatis.annotations.CacheNamespace
// It may **produce dirty data** in distributed deployments.
SynchronizedCache -> LoggingCache -> LruCache -> PerpetualCache
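To make the chain concrete, here is the same stack built by hand (illustrative only; MyBatis assembles it for you during the construction steps above):
import org.apache.ibatis.cache.Cache;
import org.apache.ibatis.cache.decorators.LoggingCache;
import org.apache.ibatis.cache.decorators.LruCache;
import org.apache.ibatis.cache.decorators.SynchronizedCache;
import org.apache.ibatis.cache.impl.PerpetualCache;

// Hand-built equivalent of the standard decorator chain for one namespace.
public class DefaultChainSketch {
  public static Cache buildByHand(String namespace) {
    Cache cache = new PerpetualCache(namespace); // base store: a plain HashMap
    cache = new LruCache(cache);                 // least-recently-used eviction
    cache = new LoggingCache(cache);             // logs the hit ratio
    return new SynchronizedCache(cache);         // serializes access for thread safety
  }
}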
And if you are using a customized cache, you will get
LoggingCache -> CustomCache
To get a log, override `getId` and return the mapper's namespace as the id.
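A minimal custom cache could look like the sketch below (assuming MyBatis 3.5+, where these are the only abstract methods of org.apache.ibatis.cache.Cache; the HashMap store is just a stand-in for a real backend such as Redis):
import java.util.HashMap;
import java.util.Map;
import org.apache.ibatis.cache.Cache;

// Illustrative custom cache. MyBatis requires the (String id) constructor and passes
// the mapper's namespace as the id; returning it from getId() makes LoggingCache
// output attributable to the namespace.
public class CustomCache implements Cache {
  private final String id;
  private final Map<Object, Object> store = new HashMap<>(); // stand-in backend

  public CustomCache(String id) { this.id = id; }

  @Override public String getId() { return id; }
  @Override public void putObject(Object key, Object value) { store.put(key, value); }
  @Override public Object getObject(Object key) { return store.get(key); }
  @Override public Object removeObject(Object key) { return store.remove(key); }
  @Override public void clear() { store.clear(); }
  @Override public int getSize() { return store.size(); }
}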
Cache process flow
By default (cacheEnabled=true), the framework creates a CachingExecutor as a proxy (this is what is called the second-level cache) in front of the database executor.
// Query -> CachingExecutor -> SimpleExecutor
org.apache.ibatis.session.Configuration#newExecutor
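Paraphrased, the wiring looks roughly like this (not the verbatim source; check newExecutor in your MyBatis version):
// The base executor is chosen by ExecutorType, then wrapped by CachingExecutor when
// cacheEnabled is true, so every query passes through the L2 cache first.
Executor executor;
switch (executorType) {
  case BATCH: executor = new BatchExecutor(this, transaction); break;
  case REUSE: executor = new ReuseExecutor(this, transaction); break;
  default:    executor = new SimpleExecutor(this, transaction); break;
}
if (cacheEnabled) {
  executor = new CachingExecutor(executor);                  // the L2-cache proxy
}
executor = (Executor) interceptorChain.pluginAll(executor);  // plugins wrap last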
Here is a brief process-flow diagram demonstrating what MyBatis does with its caches when a query comes in.
- L1 cache implementation: Java HashMap, aka LocalCache.
- L2 cache implementation: Redis.
`HGET` and `HSET` are commands for the Redis hash data type, and `id` is the namespace of the mapper.
sequenceDiagram
Query-->>CacheExecutor: HGET id cacheKey?
CacheExecutor-->> SimpleExecutor: HashMap.get(cacheKey)
SimpleExecutor -->> DB: select * from TABLE
activate DB
DB -->> SimpleExecutor: value
deactivate DB
CacheExecutor -->> SimpleExecutor: HashMap.put(cacheKey, value)
SimpleExecutor -->> CacheExecutor: value
Query -->>CacheExecutor: HSET id cacheKey value
CacheExecutor-->>Query: value
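In a Redis-backed variant of the custom cache above, getObject/putObject could map onto those commands roughly like this (Jedis API; the pool field and the keyOf/serialize/deserialize helpers are assumptions, not MyBatis or Redis-cache code):
// The mapper namespace (id) is the Redis hash, each CacheKey is a field in it.
@Override
public Object getObject(Object key) {
  try (Jedis jedis = pool.getResource()) {
    byte[] raw = jedis.hget(id.getBytes(), keyOf(key));        // HGET id cacheKey
    return raw == null ? null : deserialize(raw);
  }
}

@Override
public void putObject(Object key, Object value) {
  try (Jedis jedis = pool.getResource()) {
    jedis.hset(id.getBytes(), keyOf(key), serialize(value));   // HSET id cacheKey value
  }
}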
For more information (distributed Redis lock), read my gitbook on Redis.
Transactions (TransactionalCacheManager)
In the L2 cache, only `put`, `get` and `clear` will be called, even though all methods of the interface are implemented.
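The staging that makes this transactional is done by TransactionalCache; here is a simplified model of the idea (not the actual class):
import java.util.HashMap;
import java.util.Map;
import org.apache.ibatis.cache.Cache;

// Simplified model: writes are staged per session and only pushed to the shared
// namespace Cache on commit, so uncommitted data never reaches the L2 cache.
public class TransactionalCacheSketch {
  private final Cache delegate;                                 // the namespace cache
  private final Map<Object, Object> entriesToAddOnCommit = new HashMap<>();

  public TransactionalCacheSketch(Cache delegate) { this.delegate = delegate; }

  public Object getObject(Object key) { return delegate.getObject(key); } // reads pass through
  public void putObject(Object key, Object value) { entriesToAddOnCommit.put(key, value); }
  public void commit() { entriesToAddOnCommit.forEach(delegate::putObject); entriesToAddOnCommit.clear(); }
  public void rollback() { entriesToAddOnCommit.clear(); }      // staged entries are dropped
}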
Redis caching improvements
In addition to the LinkedHashMap-based LRU cache, we also use Redis for distributed caching. There is already a Jedis-based open-source project called Redis-cache; however, there are some improvements to be made.
- It creates a pool on each construction; a singleton instance is better.
- It doesn't support Redis sentinel mode.
- The JDK-based serializer is risky when deployed across different platforms; JSON, XML or Parcelable is preferred.
- There is no namespace for the Redis keys. `cache:com.xx.mapper` is more maintainable and debuggable when you `DEL` keys by prefix (see the sketch below).
You need to fork the project and create your own cache.
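For the namespace point, the forked clear() could delete only the keys under the namespace prefix (a sketch assuming the Jedis 3.x API and a flat cache:<namespace>:<cacheKey> key layout; pool and the prefix are illustrative):
// SCAN is used instead of KEYS so Redis is not blocked by a large keyspace.
@Override
public void clear() {
  try (Jedis jedis = pool.getResource()) {
    String cursor = ScanParams.SCAN_POINTER_START;
    ScanParams params = new ScanParams().match("cache:" + id + ":*").count(100);
    do {
      ScanResult<String> page = jedis.scan(cursor, params);
      page.getResult().forEach(jedis::del);                     // DEL each matched key
      cursor = page.getCursor();
    } while (!"0".equals(cursor));
  }
}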
Handling multiple tables with cache-ref
Suppose there is a student with lessons, and both mappers turn the cache on.
<!-- com.xxx.lessonMapper -->
<mapper namespace="com.xxx.lessonMapper">
<cache type="REDIS"/>
<select id="selectLessonWithStudent">
SELECT s.name, s.age, l.name as L_NAME
from student s left join lesson l
on s.lesson_id = l.id
</select>
</mapper>
<!-- com.xxx.studentMapper -->
<mapper namespace="com.xxx.studentMapper">
<cache type="REDIS"/>
<update id="updateName">
UPDATE student SET name = #{name} WHERE id = #{id}
</update>
</mapper>
When a student's name is updated, the result of `selectLessonWithStudent` is not flushed, and dirty data will be fetched.
This is fixed by sharing the namespace with cache-ref:
<!-- com.xxx.lessonMapper -->
<mapper namespace="com.xxx.lessonMapper">
- <cache type="REDIS"/>
+ <cache-ref namespace="com.xxx.studentMapper"/>
<select id="selectLessonWithStudent">
SELECT s.name, s.age, l.name as L_NAME
from student s left join lesson l
on s.lesson_id = l.id
</select>
</mapper>
When data in lesson or student is updated, `flushCache` will be called and ALL cache entries in the namespace will be flushed; there is no invalidation by a specific cacheKey, so the hit ratio will be noticeably lower.
// ALL cache in the same namespace will be flushed.
org.apache.ibatis.executor.CachingExecutor#flushCacheIfRequired
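Paraphrased, the check is roughly (not the verbatim source):
// When a statement is marked flushCache (the default for insert/update/delete),
// the WHOLE namespace cache is cleared; there is no per-key invalidation.
private void flushCacheIfRequired(MappedStatement ms) {
  Cache cache = ms.getCache();
  if (cache != null && ms.isFlushCacheRequired()) {
    tcm.clear(cache);                          // tcm = the TransactionalCacheManager
  }
}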
Conclusion
- It's better to use the cache on a single table only.
- When caching joined tables, use cache-ref to share the namespace or turn the cache off manually.
- It's better to handle caching in business code and derive your own cache key (e.g. store the result in Elasticsearch as a document).
Alternative performance improvements
- Analyse the SQL AST in an interceptor and flush only the changed entries -> it's too complex.
- Do static analysis on the XML and SQL -> it's too complex too.
APPENDIX
Avoiding the risk of the L1 cache
In most situations, leaving the L1 cache on is risky if you do not have full control over the project: two cached results may refer to the same object (e.g. repeated queries in a for loop).
// e.g. in a service method, inside one SqlSession
List<Student> list1 = mapper.select();
// modify the returned object in place
list1.get(0).setName("Modified");
// the second identical query returns the same, now dirty, list from the L1 cache
List<Student> list2 = mapper.select();
assert list1 == list2; // same reference
To fix the problem:
- Avoid issuing the same query inside @Transactional, and always remove repeated queries.
- Turn the L1 cache off (see issue #482) and hit the DB directly.
<settings>
<!-- flushes the local HashMap after each statement in BaseExecutor. -->
<setting name="localCacheScope" value="STATEMENT"/>
</settings>