IT3F Table Of Data Sets                                                                       IT3F Documentation

Family | pFAM Clan ID | pFAM ID | HMM Length (a.a.)¹ (Match States) | Number Of Proteins (Category 1 (Category 2, Category 3)²): Arabidopsis (At), Rice (Os), Brachypodium (Bd), Lotus (Lj), Moss³ (Pp) | Synonyms For Some Famous Genes In The Family

Helix-turn-helix motifs: CL0123
MYB family: PF00249
  (R1)R2R3MYB⁵ (R2R3 only) 98 136 130 81 140 56 PMG COV WER GL1 TDF TT2 RAX PFG PAP DUO FOUR LIPS FLP PHAN AS HAG MYB3R-
  MYB1R⁵ (SHAQKY (e.g. DIV, LHY), GARP) 58 88 90 75 70 65 PHR APL UNE GLK2 ARR GPRI KAN PCL GPRI PRR RVE LHY EPR CCA LCL ADA2A SWI ATRL ALY
  MYB (summary of all subfamilies⁴) 60 37 (9, 2) 11 (3, 3) 0 (0, 0) 20 (6, 3) CAPRICE CPL TRB ETC TRY TCL EPR PIE TRF GPRI ADA2A
HD (homeodomain) CL0123 PF00046 57 97 97 91 73 47 ANL HDG GL PDF ATML LMI WOX PHV REV HAT PHB FWA PRS WUS BEL1 BLH SAW PNF EDA KNAT STM LSN PRHA
ZF-HD⁵ PF04770 62 17 16 15 19 10 HD MEE MIF
HSF CL0123 PF00447 93 24 25 24 18 8 RHA
E2F-DP PF02319 8 7
Trihelix PF10545 26 20
Helix-loop-helix motifs:
bHLH PF00010 59 171 181 118 106 CIB PIF ATAIB PIL ALC UNE ZCW32 HEC BEE ATMYC EGL3 AMS HFR1 FMA BIM GL3 TT8 FIT NAI ICE MUTE DYT SPCH RGE ILR LHW SAC51 MEE8 KDR PRE PAR ALC
TCP⁵ PF03634 56 24 22 21 21 6 MEE CYCLOIDEA PTF BRC CYC TB1
Zinc finger(-like) motifs:
C2H2 CL0361 (classical C2H2 and C2HC) PF00096 24 120 128 110 115 73 IDD !C3HC4 !RING finger STOP JACKDAW AZF ZFP REF SUPERMAN SGR ELF TT DOT HDT SUF HAM PUX RHL ZAT TAC JAGGED HDA NUBBIN MBD FIS EMB PHS PRMT
DOF⁵ PF02701 63 36 31 28 23 23 CDF DAG OBP
GATA (C2C2 type) CL0167 (zinc beta ribbon) PF00320 36 34 27 27 19 16 MONOPOLE CGA BME ZML
LIM PF00412 13 12
PHD CL0390 PF00628 54 86 84 81 55 82 MBD ATX SGT HAT3 ING EMB AL SHL XR ORC1[A-D] SDG MMD SIZ MS VIM ORTH HAC VIN ELP EDM
TAZ (putative zinc finger) PF02135 9 3
WRKY⁵ CL0274 PF03106 55 73 103 75 74 37 ZAP TTG MSP MEE TTR RRS MAPKKK
Other large TF families:
AP2/ERF family⁵: CL0081 (MBD-like) PF00847
  AP2-like 169 18 25 24 22 12 AIL_PTL_BBM_PLT_ANT_WRI_APETALA-2_TOE_SMZ_SNZ RAP2.7
  ERF 59 128 142 106 133 149 CRF RRTF ABR ORA TINY SHN CEJ DRNL ORA HRD DDF RAV TEM ANT PLT
B3⁵ CL0405 (pseudo-barrel domain) PF02362 43 98 92 71 51 51 RAV TEM NGA ABI MEE VRN FUS REM RTV LEC HSI HSL ARF VAL NPH IAA MONOPTEROS ETT
bZIP CL0018 PF07716 52 77 95 81 45 42 FD GBF ABF ABI VIP HY TGA OBF UNE HDG CHUP
CCAAT (HAP) CL0012 (Histone Superfamily) PF00808 84 27 29 29 19 HAP LEC1 HTA EB MGH HIS
MADS PF00319 75 105 74 52 88 21 SEP SOC SHP ANR AP TT CAL FLC FLOWERING LOCUS C MAF4 PHE SVP
NAC⁵ PF02365 138 108 139 75 115 32 CUC NTL NARS VND RD NARS NAP NST NTM SND XND
Other Small TF families:
ARF⁵ (Auxin Response Factor) PF06507 10     10
ARID PF01388 10     10
AUX-IAA⁵ PF02309 28 30
BBR-BPC⁵ PF06217 7 4
BES1⁵ PF05687 8 4
EIL PF04873 6 7
FHA CL0357 (SMAD / FHA) PF00498 16 14
GeBP⁵ PF04504 21 6
GRAS⁵ PF03514 331 33 59 45 55 43 PAT SCL RGL SGR RGA GAI SCARECROW
JumonC CL0029 (cupin) PF02373 17 16
MBF1 PF08523 78 3 2 4 5 3 MBF
PLATZ⁵ PF04640 10 10
S1Fa⁵ PF04689 71 4 1 1 5 3
SBP⁵ PF03110 79 17 19 18 16 14
SRS⁵ PF05142 10 9

They are not all here! Other TF families
can be added on request - email me!
Other small TF families: Alfin, AT-hook, AS2, bHSH, C2C2(Zn)-CO, C3H-type 1 (Zn),
CAMTA, CBF5, CCAAT, CPP(Zn), CSD, DBP, DTT, GIF, GRF, HRT, LUG, Nin-like,
NOZZLE (NZZ), SET (PcG), RB, SAP, Sir2, Sigma70-like, SNF, SW13, Swi, TUB,
ULT, VOZ, VIP3, Whirly, ZIM

Non-TF families that contain proteins encoding
enzymes of secondary metabolism:
BAHD-AT CL0149 PF02458 437 56 (6, 3) 113 (33, 0) 82 (15, 0) 57 (44, 0) CHAT HCT SHT AT5MAT SDT EMB AACT EPS CER
CHI PF02431 198 6 6 6 6 6 TT5
Epimerase CL0063 PF01370 261 62 (1, 0) 78 (3, 1) 64 (4, 1) 35 (13, 2) 0 (0, 0) MUR GER GMD UGE RHD HSR AXS GAE SQD CCR BAN DFR TT
NmrA CL0063 PF05368 308 18 (0, 1) 21 (5, 1) 17 (1, 0) 17 (9, 1) 0 (0, 0) PRR PCB
OG-Fe CL0029 (cupin) PF03171 105 117 (1, 1) 107 (0, 0) 92 (1, 2) 117 (4, 0) 0 (0, 0) GA\d{1,}OX DMR SRG ACO TT FLS AOP
P450 PF00067 503 226 (18, 1) 0 (0, 0) 0 (0, 0) 104 (92, 4) 0 (0, 0) C4H REF TRANSPARENT TESTA UNE FAH SUR THAD PAD BAS LUT GA REQUIRING 3 BUS EMB KAO
SDR CL0063 PF00106 175 87 (3, 0) 96 (9, 1) 0 (0, 0) 36 (15, 1) 0 (0, 0) SAG HSG ATA ABA SDR IBR FEY3 POR[A-Z]
UDPGT CL0113 PF00201 150 112 (2, 0) 189 (19, 0) 0 (0, 0) 101 (19, 0) 0 (0, 0)

¹ HMM Length (a.a.): this value corresponds to the number of amino acid (a.a.) columns in the alignment used for the phylogenetic analysis. The HMM is provided in the hyperlink. The model is either based on the pFAM model or has been optimised in-house.
² Categories: the phylogenetic trees contain only protein sequences from category 1. Category 1 proteins contain a sufficiently complete DNA binding domain, whereas category 2 proteins do not; category 3 proteins were present on long terminal branches (>0.8) in a preliminary tree.
³ Moss: phylogenetic trees will be available for each family containing the full complement of proteins from a higher plant (Arabidopsis) and a lower plant (moss), so that the subgroups shared by these diverse plant lineages can be observed easily.
⁴ These trees still need optimising.
⁵ Indicates the families and subfamilies that are confined to the plant kingdom.
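
For anyone reading the table programmatically, the "Category 1 (Category 2, Category 3)" count cells can be split into their three numbers. The following is only a minimal sketch: the names `ProteinCount` and `parse_count` are hypothetical and not part of IT3F, and it assumes each cell is either a bare number (e.g. `171`) or a number followed by a bracketed pair (e.g. `56 (6, 3)`).

```python
import re
from typing import NamedTuple, Optional


class ProteinCount(NamedTuple):
    """One species cell from the table (hypothetical helper type)."""
    category1: int            # proteins used in the phylogenetic trees
    category2: Optional[int]  # None when the cell gives no breakdown
    category3: Optional[int]  # None when the cell gives no breakdown


# Assumed cell layout: "N" or "N (M, K)", with optional surrounding spaces.
_CELL = re.compile(r"^\s*(\d+)(?:\s*\(\s*(\d+)\s*,\s*(\d+)\s*\))?\s*$")


def parse_count(cell: str) -> ProteinCount:
    """Parse a count cell such as '56 (6, 3)' or '171'."""
    match = _CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognised count cell: {cell!r}")
    c1, c2, c3 = match.groups()
    return ProteinCount(
        int(c1),
        int(c2) if c2 is not None else None,
        int(c3) if c3 is not None else None,
    )


if __name__ == "__main__":
    print(parse_count("56 (6, 3)"))
    print(parse_count("171"))
```

Cells without a bracketed pair are returned with `None` for categories 2 and 3 rather than zero, since the table does not say whether a missing breakdown means "none" or "not reported".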