Those Things About Linux: I Am UHCI (29) FSBR – fudan

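Now let's turn to FSBR. FSBR itself has already been covered, but the code is full of variables and functions related to it. Unless we sort them out, you will probably end up as confused and lost as I was. So let us light the Aladdin's lamp of the mind and make our way through the fog of this code together.

struct uhci_hcd has several members: unsigned int fsbr_is_on, unsigned int fsbr_is_wanted, unsigned int fsbr_expiring, and struct timer_list fsbr_timer. All of them exist for FSBR, which shows how seriously the author of the code takes it.

Only two functions change fsbr_is_on: uhci_fsbr_on and uhci_fsbr_off. As the names suggest, the former sets fsbr_is_on to 1 and the latter sets it to 0.

Only two functions change fsbr_is_wanted as well: uhci_urbp_wants_fsbr and uhci_scan_schedule. Likewise, the former sets fsbr_is_wanted to 1 and the latter sets it to 0.

fsbr_expiring, on the other hand, is changed in three places: uhci_urbp_wants_fsbr sets it to 0, uhci_fsbr_timeout also sets it to 0, and uhci_scan_schedule sets it to 1.

As for fsbr_timer, it is a timer that is initialized in uhci_start with setup_timer, which binds it to uhci_fsbr_timeout(). uhci_scan_schedule() calls mod_timer to arm this time bomb, uhci_urbp_wants_fsbr() calls del_timer to defuse it, and uhci_stop calls del_timer_sync to dispose of it for good.

Let's first see what the function bound to this time bomb looks like. uhci_fsbr_timeout() comes from drivers/usb/host/uhci-q.c: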

static void uhci_fsbr_timeout(unsigned long _uhci)
{
	struct uhci_hcd *uhci = (struct uhci_hcd *) _uhci;
	unsigned long flags;

	spin_lock_irqsave(&uhci->lock, flags);
	if (uhci->fsbr_expiring) {
		uhci->fsbr_expiring = 0;
		uhci_fsbr_off(uhci);
	}
	spin_unlock_irqrestore(&uhci->lock, flags);
}

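As you can see, this function does little more than call uhci_fsbr_off() and set fsbr_expiring to 0, and it calls uhci_fsbr_off() only if fsbr_expiring is nonzero. So let's go to uhci_scan_schedule and look at the context in which mod_timer is called.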

1705	/*
1706	 * Process events in the schedule, but only in one thread at a time
1707	 */
1708	static void uhci_scan_schedule(struct uhci_hcd *uhci)
1709	{
1710		int i;
1711		struct uhci_qh *qh;
1712
1713		/* Don't allow re-entrant calls */
1714		if (uhci->scan_in_progress) {
1715			uhci->need_rescan = 1;
1716			return;
1717		}
1718		uhci->scan_in_progress = 1;
1719	rescan:
1720		uhci->need_rescan = 0;
1721		uhci->fsbr_is_wanted = 0;
1722
1723		uhci_clear_next_interrupt(uhci);
1724		uhci_get_current_frame_number(uhci);
1725		uhci->cur_iso_frame = uhci->frame_number;
1726
1727		/* Go through all the QH queues and process the URBs in each one */
1728		for (i = 0; i < UHCI_NUM_SKELQH - 1; ++i) {
1729			uhci->next_qh = list_entry(uhci->skelqh[i]->node.next,
1730					struct uhci_qh, node);
1731			while ((qh = uhci->next_qh) != uhci->skelqh[i]) {
1732				uhci->next_qh = list_entry(qh->node.next,
1733						struct uhci_qh, node);
1734
1735				if (uhci_advance_check(uhci, qh)) {
1736					uhci_scan_qh(uhci, qh);
1737					if (qh->state == QH_STATE_ACTIVE) {
1738						uhci_urbp_wants_fsbr(uhci,
1739							list_entry(qh->queue.next, struct urb_priv, node));
1740					}
1741				}
1742			}
1743		}
1744
1745		uhci->last_iso_frame = uhci->cur_iso_frame;
1746		if (uhci->need_rescan)
1747			goto rescan;
1748		uhci->scan_in_progress = 0;
1749
1750		if (uhci->fsbr_is_on && !uhci->fsbr_is_wanted &&
1751				!uhci->fsbr_expiring) {
1752			uhci->fsbr_expiring = 1;
1753			mod_timer(&uhci->fsbr_timer, jiffies + FSBR_OFF_DELAY);
1754		}
1755
1756		if (list_empty(&uhci->skel_unlink_qh->node))
1757			uhci_clear_next_interrupt(uhci);
1758		else
1759			uhci_set_next_interrupt(uhci);
1760	}

As you can see, just before calling mod_timer we set fsbr_expiring to 1, and the delay passed to mod_timer is FSBR_OFF_DELAY. This macro is defined in drivers/usb/host/uhci-hcd.h:

/* When no queues need Full-Speed Bandwidth Reclamation,
 * delay this long before turning FSBR off */
#define FSBR_OFF_DELAY		msecs_to_jiffies(10)

/* If a queue hasn't advanced after this much time, assume it is stuck */
#define QH_WAIT_TIMEOUT		msecs_to_jiffies(200)

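The two macros are defined side by side, and a man's intuition says they must be related somehow. Indeed, struct uhci_qh has a member, unsigned int wait_expired. uhci_activate_qh sets it to 0, and uhci_advance_check assigns it twice, once to 0 and once to 1. This variable is what ties in with the macro QH_WAIT_TIMEOUT.

But let's look at the first macro first. By its definition, FSBR_OFF_DELAY stands for 10 milliseconds. The way Alan Stern sees it, FSBR is a mechanism for making full use of resources, but it also adds to the load on the system to some extent, so it should be disabled as soon as it is no longer in use. In Alan Stern's experience, if no URB has been using FSBR for 10 milliseconds, FSBR should be turned off; in theory, 10 milliseconds is enough time for a driver to submit another URB.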
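To get a feel for what these two delays amount to in timer ticks, here is a rough user-space sketch. model_msecs_to_jiffies is a made-up helper, not the kernel's msecs_to_jiffies(); it only assumes that the conversion rounds up to whole ticks, and it takes HZ as an explicit parameter so that a few common values (100, 250, 1000) can be tried.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for msecs_to_jiffies(), for illustration only.
 * Assumption: a millisecond count is rounded UP to whole ticks.
 * The overflow clamping a real implementation needs is left out.
 */
static unsigned long model_msecs_to_jiffies(unsigned int ms, unsigned int hz)
{
	assert(hz > 0);
	return ((unsigned long)ms * hz + 999UL) / 1000UL;
}
```

With this model, the 10 ms FSBR_OFF_DELAY comes out as 10 ticks at HZ=1000 but only 3 ticks at HZ=250 and a single tick at HZ=100, so on a coarse tick the real delay is only approximately 10 ms. The 200 ms QH_WAIT_TIMEOUT is a whole number of ticks in all three cases.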

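In fact, uhci_add_fsbr() checks whether the URB_NO_FSBR flag is set on a URB and, if it is not, sets urbp->fsbr to 1. Hardly anyone bothers to set URB_NO_FSBR, so we can basically assume that uhci_add_fsbr() always sets urbp->fsbr to 1. Only two functions call uhci_add_fsbr(): uhci_submit_bulk() and uhci_submit_control(). So if we look at the debugfs output, we find that when there is neither Bulk nor control traffic, FSBR is 0, i.e. fsbr_is_on is 0, whereas during a Bulk transfer or a full-speed control transfer FSBR should be 1. The following snapshot, for example, was taken while I was writing to a USB flash drive, at the moment I was copying a file of a few dozen megabytes to it: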

localhost:~ # cat /sys/kernel/debug/uhci/0000:00:1d.0
Root-hub state: running FSBR: 1
HC status
usbcmd = 00c1 Maxp64 CF RS
usbstat = 0000
usbint = 000f
usbfrnum = (0)958
flbaseadd = 146eb958
sof = 40
stat1 = 0095 Enabled Connected
stat2 = 0080
Most recent frame: 31a41 (577) Last ISO frame: 31a41 (577)
Periodic load table
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Total: 0, #INT: 0, #ISO: 0

The "FSBR" we see here corresponds to uhci->fsbr_is_on.

However, it is uhci_fsbr_on() that sets fsbr_is_on, not uhci_add_fsbr(). uhci_fsbr_on() is called by uhci_urbp_wants_fsbr(), which first checks urbp->fsbr. As we just said, uhci_add_fsbr() has set urbp->fsbr to 1, and that is why uhci_fsbr_on gets executed here.
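If you want to watch this flag from a small test program rather than by eye, something like the following sketch would do. parse_fsbr_flag is a made-up helper, and the only thing it assumes about the debugfs file is what the dump above shows: a line that contains the text "FSBR:" followed by 0 or 1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the FSBR value from one line of the
 * debugfs dump, e.g. "Root-hub state: running FSBR: 1".
 * Returns 0 or 1, or -1 if the line carries no valid FSBR field.
 */
static int parse_fsbr_flag(const char *line)
{
	const char *p;
	int value;

	assert(line != NULL);
	p = strstr(line, "FSBR:");
	if (p == NULL)
		return -1;
	if (sscanf(p, "FSBR: %d", &value) != 1)
		return -1;
	return (value == 0 || value == 1) ? value : -1;
}
```

Feed it each line read with fgets() from the debugfs file and stop at the first result that is not -1.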

79	static void uhci_urbp_wants_fsbr(struct uhci_hcd *uhci, struct urb_priv *urbp)
80	{
81		if (urbp->fsbr) {
82			uhci->fsbr_is_wanted = 1;
83			if (!uhci->fsbr_is_on)
84				uhci_fsbr_on(uhci);
85			else if (uhci->fsbr_expiring) {
86				uhci->fsbr_expiring = 0;
87				del_timer(&uhci->fsbr_timer);
88			}
89		}
90	}

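Three functions call uhci_urbp_wants_fsbr(), and the first is naturally uhci_urb_enqueue(). As we saw on line 1435 of uhci_urb_enqueue(), once uhci_activate_qh has activated the qh, uhci_urbp_wants_fsbr can be called to turn FSBR on.

So once FSBR has been turned on, when is it set back to 0? In other words, when is uhci_fsbr_off, which sets fsbr_is_on to 0, called? There are two places: suspend_rh and uhci_fsbr_timeout. Let's look at the latter first. It is the function bound to the time bomb described above, and the mod_timer call that arms it is made in uhci_scan_schedule().

Three conditions must hold before mod_timer is called, as line 1750 of uhci_scan_schedule() shows: fsbr_is_on must be 1, fsbr_is_wanted must be 0, and fsbr_expiring must be 0. The first is easy to understand and is a given, since there is nothing to turn off unless FSBR is on. The second and third have to do with the uhci_urbp_wants_fsbr() call on line 1738. As for fsbr_is_wanted, line 1721 of uhci_scan_schedule() sets it to 0 right at the start, but if uhci_urbp_wants_fsbr runs, it sets fsbr_is_wanted back to 1. As for fsbr_expiring, its initial value is 0 and nothing has changed it so far, so it is still 0 whether or not uhci_urbp_wants_fsbr runs, at least in this context.

The question, then, is whether uhci_urbp_wants_fsbr gets executed. That depends on the if statement on line 1737, i.e. on whether qh->state equals QH_STATE_ACTIVE, which in turn depends on uhci_scan_qh. The job of uhci_scan_qh is to see whether the qh's URB queue has become empty. If it has not, uhci_activate_qh is called again and sets qh->state to QH_STATE_ACTIVE; if it has, uhci_make_qh_idle is called and sets qh->state to QH_STATE_IDLE. In other words, if the condition on line 1737 holds, there are still URBs in the qh's queue, and since there are, FSBR is kept going: fsbr_is_wanted is set to 1 again.

Sooner or later, though, the qh's queue will become empty, because every transfer ends eventually. When that day comes, qh->state is bound to be QH_STATE_IDLE after uhci_scan_qh, so uhci_urbp_wants_fsbr is not called. Put differently, with no URBs left there is no need to keep using FSBR, so this time fsbr_is_wanted stays 0. In that case lines 1752 and 1753 finally get their chance to run. First fsbr_expiring is set to 1; then, 10 milliseconds later, uhci_fsbr_timeout runs and with it uhci_fsbr_off, and FSBR finally stops. If we look at debugfs again at that point, it should look like this: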

localhost:~ # cat /sys/kernel/debug/uhci/0000:00:1d.0
Root-hub state: running FSBR: 0
HC status
usbcmd = 00c1 Maxp64 CF RS
usbstat = 0000
usbint = 000f
usbfrnum = (0)b04
flbaseadd = 10b57b04
sof = 40
stat1 = 0095 Enabled Connected
stat2 = 0080
Most recent frame: 384216 (534) Last ISO frame: 384216 (534)
Periodic load table
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Total: 0, #INT: 0, #ISO: 0

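FSBR is back to 0.

But man proposes and heaven disposes. You think you have everything under control, and then, within those 10 ms, some inconsiderate fellow submits another Bulk URB. What do you do?

Don't panic. Trust the Party, trust the government.

Suppose that fellow does submit a Bulk URB within those 10 ms. Then uhci_urb_enqueue is called, and so uhci_urbp_wants_fsbr is called again. Look back at uhci_urbp_wants_fsbr() and you will find that, because fsbr_expiring was just set to 1, the else if on line 85 holds. So uhci->fsbr_expiring is set back to 0 and, more importantly, del_timer is called: the bomb is defused by the wise Party some time within the 10 ms before it would have gone off. By now you, like me, must have a deep appreciation of how the Party's radiance shines in all directions.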

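We said earlier that there are two macros, and we now understand one of them. What about the other, QH_WAIT_TIMEOUT? It is tied to uhci_advance_check(). We have been through that function before, but the attentive reader will have noticed that we skipped part of its code at the time. Now is the moment to read that part. Here is uhci_advance_check() once more.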

1626	/*
1627	 * Check for queues that have made some forward progress.
1628	 * Returns 0 if the queue is not Isochronous, is ACTIVE, and
1629	 * has not advanced since last examined; 1 otherwise.
1630	 *
1631	 * Early Intel controllers have a bug which causes qh->element sometimes
1632	 * not to advance when a TD completes successfully. The queue remains
1633	 * stuck on the inactive completed TD. We detect such cases and advance
1634	 * the element pointer by hand.
1635	 */
1636	static int uhci_advance_check(struct uhci_hcd *uhci, struct uhci_qh *qh)
1637	{
1638		struct urb_priv *urbp = NULL;
1639		struct uhci_td *td;
1640		int ret = 1;
1641		unsigned status;
1642
1643		if (qh->type == USB_ENDPOINT_XFER_ISOC)
1644			goto done;
1645
1646		/* Treat an UNLINKING queue as though it hasn't advanced.
1647		 * This is okay because reactivation will treat it as though
1648		 * it has advanced, and if it is going to become IDLE then
1649		 * this doesn't matter anyway. Furthermore it's possible
1650		 * for an UNLINKING queue not to have any URBs at all, or
1651		 * for its first URB not to have any TDs (if it was dequeued
1652		 * just as it completed). So it's not easy in any case to
1653		 * test whether such queues have advanced. */
1654		if (qh->state != QH_STATE_ACTIVE) {
1655			urbp = NULL;
1656			status = 0;
1657
1658		} else {
1659			urbp = list_entry(qh->queue.next, struct urb_priv, node);
1660			td = list_entry(urbp->td_list.next, struct uhci_td, list);
1661			status = td_status(td);
1662			if (!(status & TD_CTRL_ACTIVE)) {
1663
1664				/* We're okay, the queue has advanced */
1665				qh->wait_expired = 0;
1666				qh->advance_jiffies = jiffies;
1667				goto done;
1668			}
1669			ret = 0;
1670		}
1671
1672		/* The queue hasn't advanced; check for timeout */
1673		if (qh->wait_expired)
1674			goto done;
1675
1676		if (time_after(jiffies, qh->advance_jiffies + QH_WAIT_TIMEOUT)) {
1677
1678			/* Detect the Intel bug and work around it */
1679			if (qh->post_td && qh_element(qh) == LINK_TO_TD(qh->post_td)) {
1680				qh->element = qh->post_td->link;
1681				qh->advance_jiffies = jiffies;
1682				ret = 1;
1683				goto done;
1684			}
1685
1686			qh->wait_expired = 1;
1687
1688			/* If the current URB wants FSBR, unlink it temporarily
1689			 * so that we can safely set the next TD to interrupt on
1690			 * completion. That way we'll know as soon as the queue
1691			 * starts moving again. */
1692			if (urbp && urbp->fsbr && !(status & TD_CTRL_IOC))
1693				uhci_unlink_qh(uhci, qh);
1694
1695		} else {
1696			/* Unmoving but not-yet-expired queues keep FSBR alive */
1697			if (urbp)
1698				uhci_urbp_wants_fsbr(uhci, urbp);
1699		}
1700
1701	done:
1702		return ret;
1703	}

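We have not discussed any of the code from line 1673 onward. First, neither wait_expired nor advance_jiffies is new. Both were assigned in uhci_activate_qh, where qh->wait_expired was set to 0 and qh->advance_jiffies was set to the time at that moment, and lines 1665 and 1666 refresh them whenever the queue is seen to have advanced. QH_WAIT_TIMEOUT is defined as 200 milliseconds. So when uhci_scan_schedule runs and calls uhci_advance_check, line 1676 asks whether more than 200 milliseconds have gone by since advance_jiffies was last updated while the queue still has not moved forward. Experience says that is not normal. It is like my bus ride to work on route 319 from Baiwanzhuang Street to Tsinghua Science Park: it should take only 40 minutes, so if one day I have been on the bus for two hours and still have not arrived, something has definitely gone wrong, either a crash or a serious traffic jam.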
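The comparison on line 1676 goes through time_after() rather than a plain ">" because the jiffies counter eventually wraps around. The sketch below shows the idea in user space. model_time_after and model_qh_timed_out are invented names; the first follows the usual signed-difference trick, and the cast from unsigned long to long assumes an ordinary two's-complement machine.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical user-space analogue of time_after(a, b):
 * true if tick count a is later than b, even across a wrap. */
static int model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* The test from line 1676, with the timeout passed in as ticks. */
static int model_qh_timed_out(unsigned long now,
		unsigned long advance_jiffies, unsigned long timeout)
{
	assert(timeout <= (unsigned long)LONG_MAX);
	return model_time_after(now, advance_jiffies + timeout);
}
```

Note that the test is strict: with a timeout of 200 ticks, the queue counts as stuck only from tick 201 after the last advance. It also keeps working when advance_jiffies sits just below ULONG_MAX and the deadline lands on the far side of the wrap.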

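So what does the code do about it?

Lines 1679 to 1684: as the comment says, this is a bug from my own Intel. Family scandals are best kept in the family, so we will float right past it.

Line 1686 sets wait_expired to 1.

Line 1692: the if has three conditions again. First, urbp is not NULL; second, urbp->fsbr is nonzero; third, TD_CTRL_IOC is not set. If all three hold, uhci_unlink_qh() is called. The comment states the purpose clearly: if the current URB wants FSBR, the qh is unlinked temporarily so that the next TD can safely be set to interrupt on completion, and that way we will know as soon as the queue starts moving again.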

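If, however, the code in the else on line 1695 runs, the queue has not advanced but has not timed out either, i.e. fewer than 200 milliseconds have passed. In that case, if the qh's URB queue still holds a urbp, uhci_urbp_wants_fsbr() is called to make sure fsbr_is_on is still 1. A netizen going by the name "做贼肾虚" cannot help asking: we just saw that as long as the qh is not empty, the time bomb in uhci_scan_schedule will not be armed, so shouldn't fsbr_is_on simply stay at 1?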

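Not so! As mentioned earlier, the function that sets fsbr_is_on to 0 is uhci_fsbr_off(), and besides uhci_fsbr_timeout() there is one more place that calls it: suspend_rh(). We have in fact already seen that the last line of suspend_rh() calls uhci_fsbr_off(). So when we wake up from our slumber, we need to make sure fsbr_is_on is still 1. The code above also shows a more direct reason: when uhci_advance_check() returns 0, uhci_scan_schedule() skips lines 1736 to 1739, so for a queue that is active but has not advanced, the call on line 1698 is the only thing that sets fsbr_is_wanted back to 1 after line 1721 has cleared it.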

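It may look as though we have just read through uhci_advance_check() all over again. But one question remains: what exactly are the consequences of calling uhci_unlink_qh() on line 1692?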

