[go: up one dir, main page]

Menu

#3094 mds: Reassembly timer timeout causes message discarded

future
unassigned
None
defect
mds
lib
major
False
2021-09-14
2019-09-28
No

Run test: export MDS_TIPC_FCTRL_ENABLED=1; ckpttest 20 11
Test sometimes failed because the Reassembly timer timeout and discarded all fragment.

Some outlined log:
- ckptnd as a mds receiver, receives the first fragment of big message, start Reassembly timer (5 seconds hard coded)

<142>1 2019-09-28T19:41:33.372579+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477553"] MDTM: Reassembly started
<143>1 2019-09-28T19:41:33.372582+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477554"] MDTM: Recd fragmented message(first frag) with Frag Seqnum=4 SVC Seq num =3, from src Adest = <72075194378064089>
<142>1 2019-09-28T19:41:33.372585+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477555"] MDTM: User Recd msg len=65223
<143>1 2019-09-28T19:41:33.372603+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477556"] MCM_DB:RecvMessage:TimerStart:Reassemble:Hdl=0xfee00007:SrcSvcId=CPA(18):SrcVdest=65535,DestSvcHdl=562945658454033
<143>1 2019-09-28T19:41:33.372616+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477557"] MDTM: size: 65262 anc is NULL
<143>1 2019-09-28T19:41:33.37262+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477558"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32770, fseq:5], rcvwnd[acked:4, rcv:4, nacked:0]
<143>1 2019-09-28T19:41:33.372623+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477559"] MDTM: Recd message with Fragment Seqnum=4, frag_num=2, from src_id=<0x01001001:3859124441>

<143>1 2019-09-28T19:41:33.372669+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477568"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32771, fseq:6], rcvwnd[acked:4, rcv:5, nacked:0]
<143>1 2019-09-28T19:41:33.372673+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477569"] MDTM: Recd message with Fragment Seqnum=4, frag_num=3, from src_id=<0x01001001:3859124441>

  • The big message causes Tipc buffer overflow

<139>1 2019-09-28T19:41:33.384242+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477862"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32801, fseq:36], rcvwnd[acked:31, rcv:33, nacked:0], Error[msg loss]
<<..>>
<139>1 2019-09-28T19:41:33.386422+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478063"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32800, fseq:35], rcvwnd[acked:46, rcv:48, nacked:0], Error[unexpected retransmission]
<<..>>
<139>1 2019-09-28T19:41:33.386658+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478091"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32813, fseq:48], rcvwnd[acked:46, rcv:48, nacked:0], Error[unexpected retransmission]
<<..>>
<139>1 2019-09-28T19:41:33.384873+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="477905"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32814, fseq:49], rcvwnd[acked:31, rcv:33, nacked:0], Error[msg loss]
<<..>>

  • The transmission problem is resolved, but the Reassembly timer has expired, any message from sender has passed the tipc flow control with correct sequence number will be dropped at mds_dt_common

<142>1 2019-09-28T19:41:38.392219+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478200"] MDTM: Processing Timer mailbox events
<143>1 2019-09-28T19:41:38.392328+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478201"] MDTM: Tmr Mailbox Processing:Reassemble Tmr Hdl=0xfee00007

<142>1 2019-09-28T19:41:38.392623+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478206"] MSG loss not enbaled mds_mcm_msg_loss

<143>1 2019-09-28T19:41:39.380193+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478210"] MDTM: size: 65262 anc is NULL
<143>1 2019-09-28T19:41:39.380227+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478211"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32822, fseq:57], rcvwnd[acked:56, rcv:56, nacked:0]
<143>1 2019-09-28T19:41:39.380243+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478212"] MDTM: Recd message with Fragment Seqnum=4, frag_num=54, from src_id=<0x01001001:3859124441>
<143>1 2019-09-28T19:41:39.380263+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478213"] MDS_DT_COMMON : reassembly queue doesnt exist seq_num=4, Adest = <0x01001001,3859124441
<139>1 2019-09-28T19:41:39.380273+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478214"] MDTM: Some stale message recd, hence dropping Adest = <72075194378064089>

<143>1 2019-09-28T19:41:39.380309+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478215"] MDTM: size: 65262 anc is NULL
<143>1 2019-09-28T19:41:39.380322+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478216"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:32823, fseq:58], rcvwnd[acked:56, rcv:57, nacked:0]
<143>1 2019-09-28T19:41:39.380333+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478217"] MDTM: Recd message with Fragment Seqnum=4, frag_num=55, from src_id=<0x01001001:3859124441>
<143>1 2019-09-28T19:41:39.380343+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="478218"] MDS_DT_COMMON : reassembly queue doesnt exist seq_num=4, Adest = <0x01001001,3859124441

<<...>>

<143>1 2019-09-28T19:43:58.561483+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="480419"] MDTM: size: 65262 anc is NULL
<143>1 2019-09-28T19:43:58.561501+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="480420"] FCTRL: [me] <-- [node:1001001, ref:3859124441], RcvData[mseq:4, mfrag:33101, fseq:336], rcvwnd[acked:334, rcv:335, nacked:0]
<143>1 2019-09-28T19:43:58.561514+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="480421"] MDTM: Recd message with Fragment Seqnum=4, frag_num=333, from src_id=<0x01001001:3859124441>
<143>1 2019-09-28T19:43:58.561524+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="480422"] MDS_DT_COMMON : reassembly queue doesnt exist seq_num=4, Adest = <0x01001001,3859124441
<139>1 2019-09-28T19:43:58.561533+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId="480423"] MDTM: Some stale message recd, hence dropping Adest = <72075194378064089>

Discussion

  • Minh Hon Chau

    Minh Hon Chau - 2019-09-30
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,7 +1,7 @@
     Run test: export MDS_TIPC_FCTRL_ENABLED=1; ckpttest 20 11
     Test sometimes failed because the Reassembly timer timeout and discarded all fragment.
    
    -Some outlight log:
    +Some outlined log:
     - ckptnd as a mds receiver, receives the first fragment of big message, start Reassembly timer (5 seconds hard coded)
    
     &lt;142&gt;1 2019-09-28T19:41:33.372579+10:00 SC-1 osafckptnd 431 mds.log [meta sequenceId=&#34;477553&#34;] MDTM: Reassembly started
    
     
  • Gary Lee

    Gary Lee - 2019-10-21
    • Milestone: 5.19.10 --> 5.20.01
     
  • Gary Lee

    Gary Lee - 2020-02-15
    • Milestone: 5.20.02 --> 5.20.05
     
  • Gary Lee

    Gary Lee - 2020-05-30
    • Milestone: 5.20.05 --> 5.20.08
     
  • Gary Lee

    Gary Lee - 2020-08-31
    • Milestone: 5.20.08 --> 5.20.11
     
  • Gary Lee

    Gary Lee - 2020-12-01
    • Milestone: 5.20.11 --> 5.21.03
     
  • Gary Lee

    Gary Lee - 2021-03-01
    • Milestone: 5.21.03 --> 5.21.06
     
  • Gary Lee

    Gary Lee - 2021-06-01
    • Milestone: 5.21.06 --> 5.21.10
     
  • Minh Hon Chau

    Minh Hon Chau - 2021-07-02
    • status: assigned --> unassigned
     
  • Gary Lee

    Gary Lee - 2021-09-14
    • Milestone: 5.21.09 --> future
     

Log in to post a comment.