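How finalize, Finalizer and the Finalizer Queue Work
1. Summary
A while ago I was troubleshooting a problem related to Java finalizers. Plenty of articles online describe finalizers, but most of them are vague: they say something like "objects with a finalize method go into the finalizer queue" and stop there. After a lot of digging I still did not understand how this finalizer queue actually works, nor why the heap held a very large number of Finalizer objects while jmap -finalizerinfo always reported 0. In the end I read the JDK source to find the answer. The code below is from OpenJDK 6.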
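2. finalize() and the creation of a Finalizer
First, if a class overrides the finalize method with a non-empty body, _has_finalizer is set to true while the class file is parsed. An empty finalize body only sets _has_empty_finalizer.
share/vm/classfile/classFileParser.cpp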

    methodHandle ClassFileParser::parse_method(constantPoolHandle cp, bool is_interface,
                                               AccessFlags *promoted_flags,
                                               typeArrayHandle* method_annotations,
                                               typeArrayHandle* method_parameter_annotations,
                                               typeArrayHandle* method_default_annotations,
                                               TRAPS) {
      ......
      if (name == vmSymbols::finalize_method_name() &&
          signature == vmSymbols::void_method_signature()) {
        if (m->is_empty_method()) {
          _has_empty_finalizer = true;
        } else {
          _has_finalizer = true;
        }
      }
      .......
    }

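When an object is allocated and has_finalizer is true, register_finalizer is called. As the condition in the code shows, registration happens on this allocation path only when RegisterFinalizersAtInit is off.
share/vm/oops/instanceKlass.cpp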

    instanceOop instanceKlass::allocate_instance(TRAPS) {
      assert(!oop_is_instanceMirror(), "wrong allocation path");
      bool has_finalizer_flag = has_finalizer(); // Query before possible GC
      int size = size_helper();  // Query before forming handle.

      KlassHandle h_k(THREAD, as_klassOop());

      instanceOop i;

      i = (instanceOop)CollectedHeap::obj_allocate(h_k, size, CHECK_NULL);
      if (has_finalizer_flag && !RegisterFinalizersAtInit) {
        i = register_finalizer(i, CHECK_NULL);
      }
      return i;
    }

    instanceOop instanceKlass::register_finalizer(instanceOop i, TRAPS) {
      if (TraceFinalizerRegistration) {
        tty->print("Registered ");
        i->print_value_on(tty);
        tty->print_cr(" (" INTPTR_FORMAT ") as finalizable", (address)i);
      }
      instanceHandle h_i(THREAD, i);
      // Pass the handle as argument, JavaCalls::call expects oop as jobjects
      JavaValue result(T_VOID);
      JavaCallArguments args(h_i);
      methodHandle mh (THREAD, Universe::finalizer_register_method());
      JavaCalls::call(&result, mh, &args, CHECK_NULL);
      return h_i();
    }

register_finalizer calls finalizer_register_method, which points to the register method of the Finalizer class:

    final class Finalizer extends FinalReference {

        private static ReferenceQueue queue = new ReferenceQueue();
        private static Finalizer unfinalized = null;

        private Finalizer next = null, prev = null;

        private Finalizer(Object finalizee) {
            super(finalizee, queue);
            add();
        }

        private void add() {
            synchronized (lock) {
                if (unfinalized != null) {
                    this.next = unfinalized;
                    unfinalized.prev = this;
                }
                unfinalized = this;
            }
        }

        /* Invoked by VM */
        static void register(Object finalizee) {
            new Finalizer(finalizee);
        }
    }

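Finalizer objects form a doubly linked list through their next and prev pointers. Because add() links every new object in front of the existing ones, the static variable unfinalized is the head of that list. The class also declares a static variable queue. The constructor calls the superclass constructor and then add().
FinalReference in turn calls its own superclass constructor: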

    class FinalReference<T> extends Reference<T> {
        public FinalReference(T referent, ReferenceQueue<? super T> q) {
            super(referent, q);
        }
    }

The Reference class has an instance field named queue. When the constructor runs, this field is set to Finalizer's static queue.

    public abstract class Reference<T> {

        ReferenceQueue<? super T> queue;

        Reference(T referent, ReferenceQueue <? super T> queue) {
            this.referent = referent;
            this.queue = (queue == null) ? ReferenceQueue.NULL : queue;
        }
    }

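To summarize so far:
1. When an object is created, if its class overrides finalize() with a non-empty body, the JVM also creates a Finalizer object for it.
2. All Finalizer objects form a doubly linked list.
3. Every Finalizer object has a field named queue, and all of these fields point to the static queue of the Finalizer class.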
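To make point 2 concrete, here is a minimal sketch of the list handling in Finalizer.add(). The class FinalizerListSketch and its Node type are hypothetical stand-ins written for this illustration, not JDK classes, and the locking is left out. It shows that every new node is linked in front of the current one, so unfinalized always refers to the most recently registered node.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the Finalizer list; not the JDK class itself.
class FinalizerListSketch {
    static final class Node {
        final String name;
        Node next, prev;
        Node(String name) { this.name = name; }
    }

    // Plays the role of the static field Finalizer.unfinalized.
    private Node unfinalized;

    // Same linking steps as Finalizer.add(), without the synchronized block.
    void add(Node n) {
        if (unfinalized != null) {
            n.next = unfinalized;
            unfinalized.prev = n;
        }
        unfinalized = n;
    }

    // Walks the list from unfinalized through the next pointers.
    List<String> names() {
        List<String> out = new ArrayList<>();
        for (Node n = unfinalized; n != null; n = n.next) {
            out.add(n.name);
        }
        return out;
    }

    public static void main(String[] args) {
        FinalizerListSketch list = new FinalizerListSketch();
        list.add(new Node("A"));
        list.add(new Node("B"));
        list.add(new Node("C"));
        System.out.println(list.names()); // prints [C, B, A]
    }
}
```

Registering A, B and C in that order yields the list C, B, A when read from unfinalized, which is why that field is best thought of as the head.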
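3. Finalizer destruction and the Finalizer Queue
Now let's look at what these variables and the queue are used for. The production problem involved CMS collection of the old generation, so CMS is the example here.
We start at the entry point of the CMS collector. By default a CMS collection is driven by a background thread. Only the key parts are shown.
share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepThread.cpp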

    void ConcurrentMarkSweepThread::run() {
      .......
      while (!_should_terminate) {
        sleepBeforeNextCycle();
        if (_should_terminate) break;
        _collector->collect_in_background(false);  // !clear_all_soft_refs
      }
      ......
    }

This calls CMSCollector's collect_in_background function:
share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp

    void CMSCollector::collect_in_background(bool clear_all_soft_refs) {
      .......
      switch (_collectorState) {
        case InitialMarking:
          ...
          break;
        case Marking:
          ...
          break;
        case Precleaning:
          ...
          break;
        case AbortablePreclean:
          ...
          break;
        case FinalMarking:
          {
            ReleaseForegroundGC x(this);

            VM_CMS_Final_Remark final_remark_op(this);
            VMThread::execute(&final_remark_op);
          }
          assert(_foregroundGCShouldWait, "block post-condition");
          break;
        case Sweeping:
          ...
        case Resetting:
          // CMS heap resizing has been completed
          reset(true);
          assert(_collectorState == Idling, "Collector state should "
            "have changed");
          stats().record_cms_end();
          // Don't move the concurrent_phases_end() and compute_new_size()
          // calls to here because a preempted background collection
          // has it's state set to "Resetting".
          break;
        case Idling:
        default:
          ShouldNotReachHere();
          break;
      }
    }

The FinalMarking case executes final_remark_op:
share/vm/gc_implementation/concurrentMarkSweep/vmCMSOperations.cpp

    void VM_CMS_Final_Remark::doit() {
      .....
      VM_CMS_Operation::verify_before_gc();

      IsGCActiveMark x; // stop-world GC active
      _collector->do_CMS_operation(CMSCollector::CMS_op_checkpointRootsFinal);

      VM_CMS_Operation::verify_after_gc();
      ......
    }

The real work happens in do_CMS_operation:

    void CMSCollector::do_CMS_operation(CMS_op_type op) {
      gclog_or_tty->date_stamp(PrintGC && PrintGCDateStamps);
      TraceCPUTime tcpu(PrintGCDetails, true, gclog_or_tty);
      TraceTime t("GC", PrintGC, !PrintGCDetails, gclog_or_tty);
      TraceCollectorStats tcs(counters());

      switch (op) {
        case CMS_op_checkpointRootsInitial: {
          SvcGCMarker sgcm(SvcGCMarker::OTHER);
          checkpointRootsInitial(true);       // asynch
          if (PrintGC) {
            _cmsGen->printOccupancy("initial-mark");
          }
          break;
        }
        case CMS_op_checkpointRootsFinal: {
          SvcGCMarker sgcm(SvcGCMarker::OTHER);
          checkpointRootsFinal(true,    // asynch
                               false,   // !clear_all_soft_refs
                               false);  // !init_mark_was_synchronous
          if (PrintGC) {
            _cmsGen->printOccupancy("remark");
          }
          break;
        }
        default:
          fatal("No such CMS_op");
      }
    }

This calls checkpointRootsFinal, which after a few more hops reaches ReferenceProcessor's enqueue_discovered_references:

    void CMSCollector::checkpointRootsFinal(bool asynch,
      bool clear_all_soft_refs, bool init_mark_was_synchronous) {
      .....
      if (asynch) {
        ......
        checkpointRootsFinalWork(asynch, clear_all_soft_refs, false);
      } else {
        // already have all the locks
        checkpointRootsFinalWork(asynch, clear_all_soft_refs,
                                 init_mark_was_synchronous);
      }
      ......
    }

    void CMSCollector::checkpointRootsFinalWork(bool asynch,
      bool clear_all_soft_refs, bool init_mark_was_synchronous) {
      .......
      {
        NOT_PRODUCT(TraceTime ts("refProcessingWork", PrintGCDetails, false, gclog_or_tty);)
        refProcessingWork(asynch, clear_all_soft_refs);
      }
      verify_work_stacks_empty();
      verify_overflow_empty();

      if (should_unload_classes()) {
        CodeCache::gc_epilogue();
      }
      ......
    }

    void CMSCollector::refProcessingWork(bool asynch, bool clear_all_soft_refs) {
      ......
      if (rp->processing_is_mt()) {
        rp->balance_all_queues();
        CMSRefProcTaskExecutor task_executor(*this);
        rp->enqueue_discovered_references(&task_executor);
      } else {
        rp->enqueue_discovered_references(NULL);
      }
      ......
    }

Inside the ReferenceProcessor there is another chain of calls:
share/vm/memory/referenceProcessor.cpp

    bool ReferenceProcessor::enqueue_discovered_references(AbstractRefProcTaskExecutor* task_executor) {
      NOT_PRODUCT(verify_ok_to_handle_reflists());
      if (UseCompressedOops) {
        return enqueue_discovered_ref_helper<narrowOop>(this, task_executor);
      } else {
        return enqueue_discovered_ref_helper<oop>(this, task_executor);
      }
    }

    template <class T>
    bool enqueue_discovered_ref_helper(ReferenceProcessor* ref,
                                       AbstractRefProcTaskExecutor* task_executor) {
      .......
      ref->enqueue_discovered_reflists((HeapWord*)pending_list_addr, task_executor);
      .......
    }

    void ReferenceProcessor::enqueue_discovered_reflists(HeapWord* pending_list_addr,
      AbstractRefProcTaskExecutor* task_executor) {
      if (_processing_is_mt && task_executor != NULL) {
        // Parallel code
        RefProcEnqueueTask tsk(*this, _discovered_refs,
                               pending_list_addr, _max_num_q);
        task_executor->execute(tsk);
      } else {
        // Serial code: call the parent class's implementation
        for (uint i = 0; i < _max_num_q * number_of_subclasses_of_ref(); i++) {
          enqueue_discovered_reflist(_discovered_refs[i], pending_list_addr);
          _discovered_refs[i].set_head(NULL);
          _discovered_refs[i].set_length(0);
        }
      }
    }

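The _discovered_refs array used in enqueue_discovered_reflists resembles an adjacency list: each array element points to a linked list, and each node of such a list is a Reference object that the collector discovered and now has to hand over for processing.
Eventually we arrive at this function: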

    void ReferenceProcessor::enqueue_discovered_reflist(DiscoveredList& refs_list,
                                                        HeapWord* pending_list_addr) {
      .....
      oop obj = NULL;
      oop next_d = refs_list.head();
      while (obj != next_d) {
        obj = next_d;
        assert(obj->is_instanceRef(), "should be reference object");
        next_d = java_lang_ref_Reference::discovered(obj);
        if (TraceReferenceGC && PrintGCDetails) {
          gclog_or_tty->print_cr("        obj " INTPTR_FORMAT "/next_d " INTPTR_FORMAT,
                                 obj, next_d);
        }
        assert(java_lang_ref_Reference::next(obj) == NULL,
               "The reference should not be enqueued");
        if (next_d == obj) {  // obj is last
          // Swap refs_list into pendling_list_addr and
          // set obj's next to what we read from pending_list_addr.
          oop old = oopDesc::atomic_exchange_oop(refs_list.head(), pending_list_addr);
          // Need oop_check on pending_list_addr above;
          // see special oop-check code at the end of
          // enqueue_discovered_reflists() further below.
          if (old == NULL) {
            // obj should be made to point to itself, since
            // pending list was empty.
            java_lang_ref_Reference::set_next(obj, obj);
          } else {
            java_lang_ref_Reference::set_next(obj, old);
          }
        } else {
          java_lang_ref_Reference::set_next(obj, next_d);
        }
        java_lang_ref_Reference::set_discovered(obj, (oop) NULL);
      }
      .....

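enqueue_discovered_reflist chains the nodes of the list together through their next fields: every node points to its successor, and the last node points either to itself (if the pending list was empty) or to the previous head of the pending list. The head of the list is then installed at pending_list_addr. This address is hard-coded in the JVM and is defined in:
share/vm/classfile/javaClasses.cpp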

    void JavaClasses::compute_hard_coded_offsets() {
      const int x = heapOopSize;
      java_lang_ref_Reference::static_pending_offset =
        java_lang_ref_Reference::hc_static_pending_offset * x;  // hc_static_pending_offset=1
    }

    HeapWord *java_lang_ref_Reference::pending_list_addr() {
      instanceKlass* ik = instanceKlass::cast(SystemDictionary::Reference_klass());
      address addr = ik->static_field_addr(static_pending_offset);
      // XXX This might not be HeapWord aligned, almost rather be char *.
      return (HeapWord*)addr;
    }

The java.lang.ref.Reference class contains the matching definition:

    public abstract class Reference<T> {
        private T referent;         /* Treated specially by GC */
        ReferenceQueue<? super T> queue;
        Reference next;
        transient private Reference<T> discovered;  /* used by VM */

        static private class Lock { };
        private static Lock lock = new Lock();

        /* List of References waiting to be enqueued.  The collector adds
         * References to this list, while the Reference-handler thread removes
         * them.  This list is protected by the above lock object.
         */
        private static Reference pending = null;
    }

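The splice performed by enqueue_discovered_reflist is easier to follow in a small model. The sketch below is an assumption-laden Java rendering of that C++ loop: PendingListSketch, Ref, discoveredList and pendingNames are hypothetical names made up for this illustration, and the atomic exchange on pending_list_addr is replaced by a plain assignment, so it only models the single-threaded case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-threaded model of ReferenceProcessor::enqueue_discovered_reflist.
class PendingListSketch {
    static final class Ref {
        final String name;
        Ref discovered; // link used while the object sits on a discovered list
        Ref next;       // link used once the object is on the pending list
        Ref(String name) { this.name = name; }
    }

    // Plays the role of the static field Reference.pending.
    static Ref pending;

    // Builds a discovered list whose last element points to itself.
    static Ref discoveredList(Ref... refs) {
        for (int i = 0; i < refs.length; i++) {
            refs[i].discovered = (i + 1 < refs.length) ? refs[i + 1] : refs[i];
        }
        return refs.length == 0 ? null : refs[0];
    }

    // Same control flow as the C++ loop, with a plain swap instead of an atomic one.
    static void enqueueDiscovered(Ref head) {
        Ref obj = null;
        Ref nextD = head;
        while (obj != nextD) {
            obj = nextD;
            nextD = obj.discovered;
            if (nextD == obj) {            // obj is the last node
                Ref old = pending;
                pending = head;            // the whole list becomes the new front
                obj.next = (old == null) ? obj : old;
            } else {
                obj.next = nextD;
            }
            obj.discovered = null;
        }
    }

    // Reads the pending list; a node whose next is itself ends the list.
    static List<String> pendingNames() {
        List<String> out = new ArrayList<>();
        Ref r = pending;
        while (r != null) {
            out.add(r.name);
            r = (r.next == r) ? null : r.next;
        }
        return out;
    }

    public static void main(String[] args) {
        enqueueDiscovered(discoveredList(new Ref("a"), new Ref("b")));
        enqueueDiscovered(discoveredList(new Ref("c")));
        System.out.println(pendingNames()); // prints [c, a, b]
    }
}
```

The second call shows the non-empty case: the new list is placed in front, and its last node is linked to the old head rather than to itself.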
Reference starts a thread internally to process this pending list:

    private static class ReferenceHandler extends Thread {

        ReferenceHandler(ThreadGroup g, String name) {
            super(g, name);
        }

        public void run() {
            for (;;) {

                Reference r;
                synchronized (lock) {
                    if (pending != null) {
                        r = pending;
                        Reference rn = r.next;
                        pending = (rn == r) ? null : rn;
                        r.next = r;
                    } else {
                        try {
                            lock.wait();
                        } catch (InterruptedException x) { }
                        continue;
                    }
                }

                // Fast path for cleaners
                if (r instanceof Cleaner) {
                    ((Cleaner)r).clean();
                    continue;
                }

                ReferenceQueue q = r.queue;
                if (q != ReferenceQueue.NULL) q.enqueue(r);
            }
        }
    }

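Stripped of locking, waiting and the Cleaner fast path, one pass of that loop is a pop from a self-terminated list followed by a hand-off. The following sketch models just that step. ReferenceHandlerSketch, Ref and step are hypothetical names, an ArrayDeque stands in for a ReferenceQueue, and a null queue stands in for ReferenceQueue.NULL, so this is an illustration of the logic rather than the JDK implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of one iteration of ReferenceHandler.run().
class ReferenceHandlerSketch {
    static final class Ref {
        final String name;
        final Deque<Ref> queue; // null stands in for ReferenceQueue.NULL
        Ref next;
        Ref(String name, Deque<Ref> queue) {
            this.name = name;
            this.queue = queue;
        }
    }

    // Plays the role of the static field Reference.pending.
    static Ref pending;

    // Pops the first pending reference and hands it to its queue, if it has one.
    // Returns false when the pending list is empty (the real thread would wait).
    static boolean step() {
        Ref r = pending;
        if (r == null) {
            return false;
        }
        Ref rn = r.next;
        pending = (rn == r) ? null : rn; // a self-link marks the end of the list
        r.next = r;
        if (r.queue != null) {
            r.queue.addLast(r);
        }
        return true;
    }

    public static void main(String[] args) {
        Deque<Ref> finalizerQueue = new ArrayDeque<>();
        Ref a = new Ref("a", finalizerQueue); // e.g. a Finalizer
        Ref b = new Ref("b", null);           // a reference registered without a queue
        a.next = b;
        b.next = b;
        pending = a;
        while (step()) { }
        System.out.println(finalizerQueue.size()); // prints 1
        System.out.println(pending == null);       // prints true
    }
}
```

Only the reference that was created with a queue ends up on it. The other one is simply dropped from the pending list.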
This thread also runs at the highest priority:

    static {
        ThreadGroup tg = Thread.currentThread().getThreadGroup();
        for (ThreadGroup tgn = tg;
             tgn != null;
             tg = tgn, tgn = tg.getParent());
        Thread handler = new ReferenceHandler(tg, "Reference Handler");
        /* If there were a special system-only priority greater than
         * MAX_PRIORITY, it would be used here
         */
        handler.setPriority(Thread.MAX_PRIORITY);
        handler.setDaemon(true);
        handler.start();
    }

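The thread unlinks the Reference that pending points to from the list and, if that Reference's queue is not ReferenceQueue.NULL, puts the Reference on that queue. For a Finalizer object, the queue is the static queue of the Finalizer class mentioned earlier, and the Finalizer class has its own thread to process it: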

    private static class FinalizerThread extends Thread {
        private volatile boolean running;
        FinalizerThread(ThreadGroup g) {
            super(g, "Finalizer");
        }
        public void run() {
            if (running)
                return;
            running = true;
            for (;;) {
                try {
                    Finalizer f = (Finalizer)queue.remove();
                    f.runFinalizer();
                } catch (InterruptedException x) {
                    continue;
                }
            }
        }
    }

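The relationship between a Reference and the queue it was created with can also be observed with the public java.lang.ref API, without waiting for a collection. The sketch below is only an illustration of that hand-off. It uses a WeakReference instead of a Finalizer (which is not accessible to application code), and it calls Reference.enqueue() by hand where the Reference Handler thread would normally do the enqueuing. The class name QueueHandOffSketch and the method handOff are made up for this example.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Illustration only: enqueue by hand, then consume like FinalizerThread does.
class QueueHandOffSketch {

    // Returns true if the reference was enqueued and the same object came back out.
    static boolean handOff() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        // The queue is fixed at construction time, as in the Reference constructor.
        WeakReference<Object> ref = new WeakReference<>(referent, queue);

        // Normally done by the Reference Handler thread after a GC.
        boolean enqueued = ref.enqueue();

        try {
            // FinalizerThread blocks in queue.remove(); a timeout is used here
            // so the sketch cannot hang.
            Reference<?> taken = queue.remove(1000);
            return enqueued && taken == ref;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(handOff()); // prints true
    }
}
```

The consumer never looks at the heap. It only sees what has been placed on the queue.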
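4. Conclusion
To wrap up, here is the life cycle of a finalizer:
- When an object is created, if its class overrides finalize() with a non-empty body, the JVM also creates a Finalizer object for it.
- All Finalizer objects form a doubly linked list.
- Every Finalizer object has a field named queue, and all of these fields point to the static queue of the Finalizer class.
- When a CMS cycle reaches the end of its marking work (the final remark), the Reference objects discovered during the cycle, including the Finalizers of objects that are no longer reachable, are linked onto Reference's pending list.
- A dedicated maximum-priority thread, the Reference Handler, processes the pending list: it takes each Reference off the list and puts it on the ReferenceQueue that the Reference points to. For a Finalizer object this is the static queue of the Finalizer class.
- The Finalizer class has its own thread that takes objects off this queue and runs the finalize method of the object each Finalizer refers to.
The JVM code is quite complex, and this reading was still fairly superficial, so there are probably gaps. I will need to find time later to go through the JVM source properly.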
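This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Please credit the author and the original URL when reposting.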
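Hello, and thank you very much for this article. It is a few years old, but it still helped me a lot. I have two questions about it.
(1) "When an object is created, if its class overrides finalize(), the JVM creates a Finalizer object at the same time": shouldn't this also require that finalize() is a non-empty method? The code quoted in the article

    if (m->is_empty_method()) {
      _has_empty_finalizer = true;
    } else {
      _has_finalizer = true;
    }

also suggests that finalize() has to be non-empty.
(2) "Finalizer objects maintain a doubly linked list through next and prev, and unfinalized is actually the tail of the list": shouldn't unfinalized be the head of the list? Going by the code

    private void add() {
        synchronized (lock) {
            if (unfinalized != null) {
                this.next = unfinalized;
                unfinalized.prev = this;
            }
            unfinalized = this;
        }
    }

suppose three objects A, B and C are added to the Finalizer list in that order. The resulting list is C B A, and unfinalized ends up pointing at C, so unfinalized always points at the head?

I think both points raised in the comment above are correct: the class must actually implement finalize with a non-empty body, and for the second question it should indeed be the head of the list.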